Scraping Alpha is a Scrapy-powered web spider that is meant to trawl the Seeking Alpha website and collate earnings call transcripts in an SQL database. I'm not sure how well it works.
PatentSlurp
Patent slurper for Dr Lars Hass, LUMS
TODO
- Add a stripper for redundant XML tags
- Harvest the data below from the Google dumps, 2001-2015:
                 storage  display  value
  variable name  type     format   label  variable label
  sta            str2     %2s             assg/state
  cnt            str3     %3s             assg/country
  assgnum        byte     %8.0g           assg/assignee seq. number (imc)
  cty            str72    %72s            assg/city
  pdpass         long     %12.0g          Unique assignee number
  ptype          str1     %9s             patent type
  patnum         long     %12.0g          patent number
- Compare data with NBER data (http://eml.berkeley.edu/~bhhall/NBER06.html)
- ...
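
The harvesting item above could be sketched roughly as follows. This is a minimal illustration in Python, not the repository's Perl code, and it rests on several assumptions: the element names (`us-patent-grant`, `publication-reference`, `assignee`, `addressbook`, and so on) are loosely modelled on USPTO grant XML and may not match every year of the dumps, the sample record is invented, and `AssigneeRow` and `harvest_assignees` are hypothetical names. `pdpass` and `ptype` are left out because nothing here says where they come from in the dumps.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Invented sample record. The element names are an assumption, loosely
# modelled on USPTO grant XML; older dump years may use a different layout.
SAMPLE_XML = """\
<us-patent-grant>
  <us-bibliographic-data-grant>
    <publication-reference>
      <document-id>
        <country>US</country>
        <doc-number>07654321</doc-number>
      </document-id>
    </publication-reference>
    <assignees>
      <assignee>
        <addressbook>
          <orgname>Example Widgets Inc.</orgname>
          <address>
            <city>Lancaster</city>
            <state>PA</state>
            <country>US</country>
          </address>
        </addressbook>
      </assignee>
      <assignee>
        <addressbook>
          <orgname>Example Holdings Ltd</orgname>
          <address>
            <city>Manchester</city>
            <country>GB</country>
          </address>
        </addressbook>
      </assignee>
    </assignees>
  </us-bibliographic-data-grant>
</us-patent-grant>
"""


@dataclass
class AssigneeRow:
    """One row per (patent, assignee), using the variable names listed above."""
    patnum: int   # patent number
    assgnum: int  # assignee sequence number within the patent
    sta: str      # assg/state (empty when the record has none)
    cnt: str      # assg/country
    cty: str      # assg/city


def _text(node: ET.Element, path: str) -> str:
    """Return the stripped text at `path`, or '' if the element is missing."""
    found = node.find(path)
    if found is None or found.text is None:
        return ""
    return found.text.strip()


def harvest_assignees(xml_text: str) -> List[AssigneeRow]:
    """Extract one AssigneeRow per <assignee> in a single patent record."""
    root = ET.fromstring(xml_text)
    # Assumes a purely numeric doc-number; non-numeric numbers (for example
    # ones with a letter prefix) would need separate handling.
    patnum = int(_text(root, ".//publication-reference/document-id/doc-number"))
    rows = []
    for seq, assignee in enumerate(root.iter("assignee"), start=1):
        rows.append(
            AssigneeRow(
                patnum=patnum,
                assgnum=seq,
                sta=_text(assignee, ".//address/state"),
                cnt=_text(assignee, ".//address/country"),
                cty=_text(assignee, ".//address/city"),
            )
        )
    return rows


if __name__ == "__main__":
    for row in harvest_assignees(SAMPLE_XML):
        print(row)
```

Rows in this shape could then be written out as CSV and lined up against the NBER data by `patnum`, although the matching itself is not covered here.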