Aaron Parecki
|
470639f486
|
recognize h-event "content" in addition to "description"
|
6 years ago |
Aaron Parecki
|
43db6098fc
|
handle the case where the server returns multiple content-type headers
|
6 years ago |
Aaron Parecki
|
7252d5a3f4
|
also parse the object inside Create activities
|
6 years ago |
Aaron Parecki
|
ca9c8c02ef
|
AS: parse likes and reposts
|
6 years ago |
Aaron Parecki
|
85d973916f
|
support articles and summary
|
6 years ago |
Aaron Parecki
|
d3e36038b2
|
parse basic ActivityStreams objects
including from rel=alternate
|
6 years ago |
Aaron Parecki
|
154b7e874a
|
check for a rel=alternate to existing parsed mf2 JSON and use that instead
|
6 years ago |
Aaron Parecki
|
70f1576926
|
support twitter animated gifs
|
6 years ago |
Aaron Parecki
|
112b75b623
|
parse quotation-of from HTML as well
closes #73
|
6 years ago |
Aaron Parecki
|
417cc1b3cc
|
parse redirect uri for h-app
parse from both link tags and the u-redirect-uri property
|
6 years ago |
Aaron Parecki
|
6f39655c8a
|
parse instagram user info from HTML instead of secret JSON API
adds script to refresh the downloaded instagram data for the tests as well
|
6 years ago |
Aaron Parecki
|
c70b29479a
|
updates for instagram parsing
instagram seems to have rolled out the `graphql` key everywhere now
|
6 years ago |
Aaron Parecki
|
25b6f85c14
|
use html5 parser and update php-mf2
|
6 years ago |
Aaron Parecki
|
4959ec15f2
|
remove duplicate url values
|
6 years ago |
Aaron Parecki
|
8026279cba
|
fix tests for new mf2 parser
main difference is the deprecated rel handling
|
6 years ago |
Aaron Parecki
|
a50cd6284b
|
fix whitespace handling for br tags in html
|
6 years ago |
Aaron Parecki
|
c27f228314
|
include in-reply-to URL for tweets
|
6 years ago |
Aaron Parecki
|
c68c7661c8
|
inspect content to determine if a page is atom or rss
closes #62
|
6 years ago |
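Note on c68c7661c8: the message does not say how the content is inspected. One plausible approach, sketched here under that assumption, is to parse the body and look at the root element rather than trusting the Content-Type header. The function name `sniff_feed_type` and its return values are hypothetical and do not come from the repository.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_ROOT = "{http://www.w3.org/2005/Atom}feed"  # namespaced Atom root element


def sniff_feed_type(body: str) -> Optional[str]:
    """Guess whether a document is Atom or RSS from its root element.

    Hypothetical helper: returns "atom", "rss", or None when the body
    is not well-formed XML or has some other root element.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None  # not XML at all, e.g. HTML tag soup or plain text
    if root.tag == ATOM_ROOT:
        return "atom"
    if root.tag == "rss":
        return "rss"
    return None
```

This sketch only covers Atom 1.0 and RSS 2.0 roots; other feed flavours would need extra cases.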
Aaron Parecki
|
cb1e32278d
|
convert newlines to <br> for html in tweets
|
6 years ago |
Aaron Parecki
|
bf4bc3a668
|
extract photos and videos from streaming tweets when truncated
|
6 years ago |
Aaron Parecki
|
fb2fcec9c6
|
include HTML for tweets with links or user mentions
also expands parsing to be able to handle twitter JSON from the streaming API which is subtly different from the HTTP API.
closes #61
|
6 years ago |
Aaron Parecki
|
584f34e1ed
|
add test from ascraeus.org which was causing an INTL error
|
6 years ago |
Aaron Parecki
|
2cc215d370
|
add .editorconfig to data folder
tells the editor to save data files with crlf needed for parsing the test http responses
|
6 years ago |
Aaron Parecki
|
aba067234c
|
add h-x-app vocabulary
closes #13
|
6 years ago |
Aaron Parecki
|
fe65def90f
|
comment out two tests until open mf2 parser issues are resolved
|
7 years ago |
Aaron Parecki
|
2515f618c7
|
include featured image for h-entry
closes #51
|
7 years ago |
Aaron Parecki
|
4d65b1ca1e
|
if removing the img results in empty content, put the name value back
closes #57
|
7 years ago |
Aaron Parecki
|
3ac38f9dbf
|
add simple case of Known markup
for #57
|
7 years ago |
Aaron Parecki
|
85c2b9b15f
|
add failing test for `p-content` containing a `u-photo`
|
7 years ago |
Aaron Parecki
|
44770396f9
|
add test to ensure a content property is not returned unless it is defined
|
7 years ago |
Aaron Parecki
|
bdedef6e1e
|
adds a bunch of broken tests for #52
|
7 years ago |
Aaron Parecki
|
a9b1001e62
|
switch to fork of picofeed with authorUrl support
* adds test of instagram-atom feed with individual authors per item
* dedupes atom/rss title if it's a prefix of the content
|
7 years ago |
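Note on a9b1001e62: the title dedupe mentioned in the second bullet could look roughly like the sketch below. It assumes the comparison is made on whitespace-trimmed plain text and that a redundant title is simply dropped; `dedupe_title` is a hypothetical name, not a function from the repository.

```python
from typing import Optional


def dedupe_title(title: Optional[str], content_text: str) -> Optional[str]:
    """Return None when the title merely repeats the start of the content.

    Assumption: both values are plain text. Any HTML would need to be
    stripped before calling this.
    """
    if not title:
        return None
    trimmed = title.strip()
    if trimmed and content_text.strip().startswith(trimmed):
        return None  # title is a prefix of the content, so it adds nothing
    return trimmed
```

A feed item titled "Hello world" whose content begins "Hello world, this is my post" would come back with no title, while an unrelated title is kept.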
Aaron Parecki
|
7872429f0c
|
prioritize url on the same domain
if an item has multiple URL values, return the one that is on the same domain
|
7 years ago |
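Note on 7872429f0c: a minimal sketch of the selection rule, assuming "the same domain" means the host of the page that was fetched and that the first value is the fallback when nothing matches. Both the fallback and the name `pick_url` are assumptions for illustration.

```python
from typing import List, Optional
from urllib.parse import urlsplit


def pick_url(candidates: List[str], page_url: str) -> Optional[str]:
    """Prefer the candidate URL whose host matches the fetched page's host."""
    page_host = urlsplit(page_url).hostname
    for url in candidates:
        if urlsplit(url).hostname == page_host:
            return url
    # Assumed fallback: keep the first value when no host matches.
    return candidates[0] if candidates else None
```

For a post syndicated to another site, this keeps the original permalink rather than the syndicated copy.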
Aaron Parecki
|
206e27ea25
|
add feed discovery API
|
7 years ago |
Aaron Parecki
|
85b8a35212
|
normalize URLs when comparing
Treats `https://example.com` and `https://example.com/` as equivalent when comparing URLs. Closes #33
|
7 years ago |
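Note on 85b8a35212: the equivalence described in the message can be sketched as below. Only the empty-path case from the commit message is handled; `normalize_url` and `urls_equal` are hypothetical names, and the repository's own normalization may cover more cases.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Give a URL with an empty path an explicit "/" path."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def urls_equal(a: str, b: str) -> bool:
    """Compare two URLs after normalization."""
    return normalize_url(a) == normalize_url(b)
```

With this, `https://example.com` and `https://example.com/` compare as equal, while URLs with different non-empty paths remain distinct.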
Aaron Parecki
|
15743d411d
|
Find author when author is a property of the h-feed
closes #32
|
7 years ago |
Aaron Parecki
|
05f7d9c86c
|
implement h-feed and other microformats feed parsing
|
7 years ago |
Aaron Parecki
|
7b16371418
|
add basic support for JSONFeed
|
7 years ago |
Aaron Parecki
|
e8e63caba6
|
implements parsing Atom and RSS feeds
|
7 years ago |
Aaron Parecki
|
a37ed3bbae
|
update to support multiple photos
uses the video's poster frame as the photo if any of the multi-post images are videos
|
7 years ago |
sebsel
|
6b286157e3
|
based tests on TwitterTest.php
|
7 years ago |
sebsel
|
67c159ec29
|
added tests
|
7 years ago |
Aaron Parecki
|
d50231142a
|
adds support for parsing checkins
checkin data is returned embedded like author data rather than in the `refs` object
closes #35
|
7 years ago |
Aaron Parecki
|
4fab3e9e0a
|
add test for HN comment
|
7 years ago |
Aaron Parecki
|
d0de523746
|
add hackernews support
closes #40
|
7 years ago |
Aaron Parecki
|
330bc9024d
|
fix parsing for hReview
thanks to the new backcompat in php-mf2 0.3.2
|
7 years ago |
Aaron Parecki
|
b76d72a77b
|
return issue labels as category
|
7 years ago |
Aaron Parecki
|
f8e9a87667
|
parse github issues and comments
closes #20
|
7 years ago |
Aaron Parecki
|
5f63ed7944
|
updates for instagram scraping
|
7 years ago |
Aaron Parecki
|
63ab3031a3
|
parse XKCD comics
skip image alt text for now
closes #34
|
7 years ago |