Aaron Parecki
8dc0caa4d0
use effective URL after following redirects when comparing URLs
8 years ago
Aaron Parecki
162d2f5ef8
add tests for feeds, catch case when a permalink has other h-entries
8 years ago
Aaron Parecki
e3000f8c06
better blacklist for google URLs
8 years ago
Aaron Parecki
c4b80506da
support parsing posted HTML
8 years ago
Aaron Parecki
8d1489bb72
fix for target param. include bookmark-of property
8 years ago
Aaron Parecki
075f78a6c1
parse h-entry even if it's not the first object
8 years ago
Aaron Parecki
d7672df96c
allow ul/li/ol
8 years ago
Aaron Parecki
e3ff109b37
restrict matching mf2 classes to only lowercase names
see http://microformats.org/wiki/microformats2-parsing-issues#ignore_u-camelCase_properties for background
8 years ago
Aaron Parecki
66a9b1cc9e
sanitize HTML in the entry
allow only a basic set of tags, and remove any non-mf2 classes
closes #2
8 years ago
Aaron Parecki
241594dcf5
sanitize HTML
sanitize the HTML returned in the content property. allows a common set of HTML tags.
for #2
8 years ago
Aaron Parecki
b9c9a6bddd
fix for author parsing
8 years ago
Aaron Parecki
ac6d86c0db
includes nested h-cite and other objects
if a property such as `in-reply-to` is an h-cite, the URL is still returned as the `in-reply-to` value, and the h-cite object is available in a different part of the response.
closes #6
8 years ago
Aaron Parecki
ed88b4881b
use file_get_contents only for appengine URLs
8 years ago
Aaron Parecki
e09ee58d8b
sometimes it returns "request failed"
file_get_contents is dumb. I hope this isn't a permanent solution.
8 years ago
Aaron Parecki
2924f35e0d
fix tests for new HTTPStream
8 years ago
Aaron Parecki
82931e46bc
switch to using file_get_contents for appengine
8 years ago
Aaron Parecki
7b955b53f2
don't follow redirects on appengine URLs
see https://cloud.google.com/appengine/docs/php/urlfetch/
8 years ago
Aaron Parecki
7fafb51e92
add todo note for feeds
8 years ago
Aaron Parecki
7075254d56
add / to URL if it doesn't have a path
8 years ago
Aaron Parecki
0d96cb2832
also return matching url for h-cards
8 years ago
Aaron Parecki
fff43444f5
also return categories
8 years ago
Aaron Parecki
69223cad1d
return matching author url
8 years ago
Aaron Parecki
e9bc4bf450
rename to X-Ray
8 years ago
Aaron Parecki
0b35b74636
implement authorship discovery
* extracts mf2 post contents from pages
* implements authorship discovery to find author info for the URL
8 years ago
Aaron Parecki
9eecc31571
parse content and name from the entry
8 years ago
Aaron Parecki
13bb06d2c9
stub mf2 parsing
8 years ago
Aaron Parecki
85c3ce7b33
starting the parse function, with tests
8 years ago
Aaron Parecki
22a71fd7e9
empty project template
8 years ago