$ mkdir pyquery_tutorial $ cd pyquery_tutorialNow create the virtual environment with your Python executable version of choice (I have only tested this for 2.6 and 2.7)
$ virtualenv env --python=python2.6 Running virtualenv with interpreter /usr/bin/python2.6 New python executable in env/bin/python2.6 Also creating executable in env/bin/python Installing distribute.................................................................................................................................................................................done.Now activate the virtualenv. (You should see (env) beside your prompt if done correctly)
$ . env/bin/activateNow we install the PyQuery package.
(env) $ pip install pyquery ... Successfully installed lxml pyquery Cleaning up...Woohoo, PyQuery is now ready for use!
<!DOCTYPE html> <html> <head> <title>PyQuery Test!</title> </head> <body> <h1>PyQuery is AWESOME!</h1> <p><a href="http://pypi.python.org/pypi/pyquery">PyQuery</a> is a Python port of the famous <a href="http://jquery.com">jQuery</a> JavaScript library. <h2>What is it Good For?</h2> <ul id="pitch"> <li>It makes parsing files a <strong>SNAP</strong>!</li> <li>DOM Manipulation is EASY!</li> <li>You <em>never</em> have to worry about confusing syntax</li> </ul> </body> </html>Now fire up Python. (Make sure your virtualenv is still activated!)
$ python Python 2.6.6 (r266:84292, Mar 25 2011, 19:36:32) [GCC 4.5.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>>First we import PyQuery from the pyquery package.
>>> from pyquery import PyQueryNow let's read in our index.html file and store it to a string.
>>> html = open("index.html", 'r').read()Now we instantiate a PyQuery object, passing in our html string. To keep things looking familiar, let the instantiating object be named jQuery!
>>> jQuery = PyQuery(html)Now we can traverse this document using the selectors we've grown to love through CSS and jQuery. It might look strange that we're assigning jQuery to something, but at this point, we use this jQuery variable JUST like we use $ in our JavaScript. For example, let's get the title tag.
>>> jQuery("title").text() 'PyQuery Test!'jQuery developers that have created their own plugin may already be comfortable using jQuery in place of $ in their JS. Let's mess around with PyQuery some more.
>>> jQuery("li").eq(1).text() 'DOM Manipulation is EASY!' >>> jQuery("a") # The 'jQuery Object' we're used to is now a list [<a>, <a>] >>> for x in jQuery("a"): # We can do for-loops as normal in Python ... print jQuery(x).text() ... PyQuery jQueryGet the HTML of the first li element.
>>> jQuery("ul").children().eq(0).html() u'It makes parsing files a <strong>SNAP</strong>!'
>>> jQuery = PyQuery(url="http://www.vertstudios.com/") >>> jQuery("title").text() "Web Design that Doesn't Suck | Vert Studios | Tyler, Texas"
Now that we've given you a nice kickstart of PyQuery, your knowledge of jQuery coupled with the PyQuery API provides sufficient power to parse XML/HTML documents. September 20, 2011
Ref: http://vertstudios.com/blog/pyquery-tutorial-basic-html-parsing-pyquery/