snarfed.org


snipscrape

Wed, 01 Jan 2003


For the impatient:
Download snipscrape 0.2
View snipscrape.xslt
SnipSnap web site

Snipscrape is a screen scraper for SnipSnap. It enables you to import pages from one SnipSnap site into another, or to recover content from SnipSnap-generated HTML pages. It does this by transforming HTML pages generated by SnipSnap into XML that can be imported.

Snipscrape is implemented in XSLT, and comes with a shell script for *nix platforms that does some useful preprocessing. You can use snipscrape without the shell script, but you'll need to perform a few tasks by hand, such as escaping special characters.
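If you skip the shell script, escaping by hand might look something like the sketch below. This is only an illustration: the helper name escape_xml is made up here, and it assumes that "special characters" means the usual XML ones (&, < and >). The real snipscrape.sh may handle a different set, or do it differently.

```shell
# Hypothetical helper, not part of snipscrape: escape the characters that
# are special in XML text. Assumes only &, < and > need escaping.
# The ampersand is handled first so the entities added by the later
# substitutions are not escaped a second time.
escape_xml() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Example: filter a line of text through the helper.
printf '%s\n' 'a < b && c > d' | escape_xml
# prints: a &lt; b &amp;&amp; c &gt; d
```

Only run something like this over text content, not over the whole HTML page, since it would escape the markup itself too.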

Assuming you'd like to scrape two pages, snip1.html and snip2.html, here's what you'd do:

$ snipscrape.sh snip1.html snip2.html > snips.xml


Now go to the manager page of the SnipSnap site you're importing into, import snips.xml, and you should be set!
