Aaron Parecki 
							
						 
						
							
							
							
								
							
								9054b0947c 
								
							
								 
							
						 
						
							
							
								
								specific error when there is no content at the URL  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								1924d1000e 
								
							
								 
							
						 
						
							
							
								
								add log messages to debug which case a URL is hitting  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								b7f49a7958 
								
							
								 
							
						 
						
							
							
								
								fix should follow redirects check  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								8dc0caa4d0 
								
							
								 
							
						 
						
							
							
								
								use effective URL after following redirects when comparing URLs  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								162d2f5ef8 
								
							
								 
							
						 
						
							
							
								
								add tests for feeds, catch case when a permalink has other h-entrys  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								e3000f8c06 
								
							
								 
							
						 
						
							
							
								
								better blacklist for google URLs  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								c4b80506da 
								
							
								 
							
						 
						
							
							
								
								support parsing posted HTML  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								8d1489bb72 
								
							
								 
							
						 
						
							
							
								
								fix for target param. include bookmark-of property  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								075f78a6c1 
								
							
								 
							
						 
						
							
							
								
								parse h-entry even if it's not the first object  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								d7672df96c 
								
							
								 
							
						 
						
							
							
								
								allow ul/li/ol  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								e3ff109b37 
								
							
								 
							
						 
						
							
							
								
								restrict matching mf2 classes to only lowercase names  
							
							see http://microformats.org/wiki/microformats2-parsing-issues#ignore_u-camelCase_properties for background 
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								66a9b1cc9e 
								
							
								 
							
						 
						
							
							
								
								sanitize HTML in the entry  
							
							allow only a basic set of tags, and remove any non-mf2 classes
							closes #2  
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								241594dcf5 
								
							
								 
							
						 
						
							
							
								
								sanitize HTML  
							
							sanitize the HTML returned in the content property. allows a common set of HTML tags.
							for #2  
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								b9c9a6bddd 
								
							
								 
							
						 
						
							
							
								
								fix for author parsing  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								ac6d86c0db 
								
							
								 
							
						 
						
							
							
								
								includes nested h-cite and other objects  
							
							if a property such as `in-reply-to` is an h-cite, the URL is still returned as the `in-reply-to` value, and the h-cite object is available in a different part of the response.
							closes #6  
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								ed88b4881b 
								
							
								 
							
						 
						
							
							
								
								use file_get_contents only for appengine URLs  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								e09ee58d8b 
								
							
								 
							
						 
						
							
							
								
								sometimes it returns "request failed"  
							
							file_get_contents is dumb. I hope this isn't a permanent solution. 
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								2924f35e0d 
								
							
								 
							
						 
						
							
							
								
								fix tests for new HTTPStream  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								82931e46bc 
								
							
								 
							
						 
						
							
							
								
								switch to using file_get_contents for appengine  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								7b955b53f2 
								
							
								 
							
						 
						
							
							
								
								don't follow redirects on appengine URLs  
							
							see https://cloud.google.com/appengine/docs/php/urlfetch/  
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								7fafb51e92 
								
							
								 
							
						 
						
							
							
								
								add todo note for feeds  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								7075254d56 
								
							
								 
							
						 
						
							
							
								
								add / to URL if it doesn't have a path  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								0d96cb2832 
								
							
								 
							
						 
						
							
							
								
								also return matching url for h-cards  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								fff43444f5 
								
							
								 
							
						 
						
							
							
								
								also return categories  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								69223cad1d 
								
							
								 
							
						 
						
							
							
								
								return matching author url  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								e9bc4bf450 
								
							
								 
							
						 
						
							
							
								
								rename to X-Ray  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								0b35b74636 
								
							
								 
							
						 
						
							
							
								
								implement authorship discovery  
							
							* extracts mf2 post contents from pages
							* implements authorship discovery to find author info for the URL 
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								9eecc31571 
								
							
								 
							
						 
						
							
							
								
								parse content and name from the entry  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								13bb06d2c9 
								
							
								 
							
						 
						
							
							
								
								stub mf2 parsing  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								85c3ce7b33 
								
							
								 
							
						 
						
							
							
								
								starting the parse function, with tests  
							
							
								
							
							
						 
						9 years ago  
				
					
						
							
							
								
									
								
								Aaron Parecki 
							
						 
						
							
							
							
								
							
								22a71fd7e9 
								
							
								 
							
						 
						
							
							
								
								empty project template  
							
							
								
							
							
						 
						9 years ago