Reposted from: http://www.pixelmonkey.org/2012/06/14/web-app
Wanna build a web app fast? Know a little bit about programming but want to build a modern web app using two well-supported, well-documented, and universally accessible languages? You’ll love these Python, HTML/CSS, and JavaScript resources.
I’ve been sharing these documents with friends who ask me, “I want to start programming and build a web app, where do I start?”. These resources have also been useful to existing programmers who know C, C++ or Java, but who want to embrace dynamic and web-based programming.
Python Resources
Python is the core programming language used at Parse.ly. It also happens to be a quickly-growing language with wide adoption in the open source community, and it is a very popular choice for web startups.
I’ve written a blog post with some original materials for learning Python, import this — learning the Zen of Python with code and slides.
This is a good starting point, but you may also find these resources very helpful:
- For absolute beginners, “Learn Python the Hard Way”. This teaches Python using a series of programming examples, but it really assumes you have no programming background whatsoever. After going through the examples in LPTHW, it may be a good idea to supplement your understanding with Think Python.
- For existing programmers, “Dive into Python 3”. This teaches Python from the starting point that you have already programmed in a mainstream language like C or Java, and want to know what makes Python really cool/good. Similar audience to my “Zen of Python” slides. Note that this tutorial teaches Python 3, but most people still use Python 2.7. See Python2orPython3 on the Python wiki to see the differences.
- For advanced programmers, “Python Essential Reference, 4th Edition”. Unfortunately, this book costs money, but it’s basically the best book on Python on the market, and it’s very up-to-date. It’s very dense and weighs in at 717 pages, so this is only for those who want to go deep on Python.
- For cheap advanced programmers, “Official Python Tutorial”. Though the Python tutorial doesn’t have the best narrative style nor the best real-world examples, for advanced programmers, it will teach the reality of the language in a comprehensible way. And, it’s free.
HTML/CSS Resources
In order to build up web applications, you’ll need to write your front-ends in HTML and CSS. These technologies have evolved over the years, but the basic principles remain from when they emerged nearly a decade ago. HTML is the markup language of the web, and you’ll see a lot of tutorials refer to HTML4, which is basically the markup standard all web browsers and websites work off. Don’t be confused by the HTML5 moniker, which often refers to much more than simply the markup — usually, it’s referring to a set of JavaScript APIs that are becoming standard in browsers, along with enhanced audio/video support and a few new “semantic markup” tags that have been added.
Since HTML is basically useless without CSS, you can get by with a short tutorial on HTML and then more advanced tutorials on CSS styling. Here’s what I recommend.
Learn the basics of HTML from MDC’s Introduction to HTML and Wikipedia’s page on HTML. This is a rare case where using Wikipedia is actually a perfect way to get the right background because half the battle with understanding HTML is understanding its history.
An excellent new guide to HTML & CSS together has been published by Shay Howe in 2013.
These look like a great first stop.
You can also use these dedicated resources for CSS specifically:
- For absolute beginners: Use W3C’s official tutorial on Starting with HTML + CSS. This was written all the way back in 2004, but provides the basics with screenshots and real code examples, so is a great way to get started.
- For existing programmers: Mozilla has done a great job putting together a quick and readable tutorial that gives you the basics at a glance.
- For advanced programmers: You’ll want to buy the best book on the subject, CSS Mastery. It has the best explanation of the box model and browser rendering engines that I’ve seen, and covers all the edge cases nicely.
- For cheap advanced programmers: You’ll need to look over the MDC (Mozilla) CSS Reference. Pay particularly close attention to the articles on the Box Model and the Visual Formatting Model.
JavaScript Resources
Aside from Python, every Parse.ly engineer also knows JavaScript, even if it is only begrudgingly. For better or for worse, JavaScript has become the world’s most popular programming language.
JavaScript is definitely the language of the web. It is also a language that has, over the last few years, accumulated a good amount of great documentation for learning it. Here are some resources you can use to get up to speed:
- For absolute beginners: “Eloquent JavaScript” introduces you to both modern programming techniques and JavaScript at the same time. It is thus a great book for beginners. There is also a print version available.
- For existing programmers: The Mozilla Developer Network (MDN) contains the web’s best and most official documentation of HTML, CSS, and JavaScript. This guide, “A Re-Introduction to JavaScript”, presents the language to an audience that already knows how to program, and focuses specifically on the “gotcha” parts of the language.
- For advanced programmers: A must-read is the short (but costly) “JavaScript: The Good Parts”. Douglas Crockford basically reintroduced the world to JavaScript as a modern programming language. He is a bit of a curmudgeon when it comes to programming style, but this makes sense since he is also the author of JSLint, an important tool used in JS development for static code checking.
- For cheap advanced programmers: Douglas Crockford, author of the above “Good Parts” book, has also given a series of public video lectures on JavaScript at Yahoo! headquarters. These are freely available online and actually present much of the same content as “Good Parts”, just in a condensed form. Warning for the cheap: though the videos are very good, the book goes into more depth and spends less time on the history of the language. Also, Matt Might’s JavaScript, Warts and workarounds is an excellent summary of some of the most important “bad parts” of JavaScript.
JavaScript “frameworks”
Though knowing JS is important to do anything web-facing, you can also leverage some frameworks to help you out. The ones I recommend are the venerable jQuery JavaScript library and the Twitter Bootstrap HTML/CSS/JavaScript components.
jQuery adds common utilities for DOM manipulation, server requests, basic animations and dynamic CSS. Bootstrap builds on jQuery and adds a common, simple UI component library using pure HTML, CSS and JavaScript. This provides a grid system for layout; nicely-designed stylesheets for typography, tables, lists, and buttons; JavaScript components that add dynamic behavior such as tabs, dropdowns, modal dialogs, navigation bars, and more.
OK: take a deep breath. You’re learning the building blocks of a modern web application: backend / frontend programming languages and their associated code libraries. Let’s aim to solidify this knowledge using modern web frameworks.
Putting it all together: Python web frameworks
If you already know the basics of Python, HTML/CSS, and JavaScript, there are only a few more things you need to know to get basic web applications working. These are: view functions, template engines, databases, and a web server.
To understand the need for these, let’s think about what happens when you type a search into Google’s search box. Google has some code running on its servers that retrieves pages, related searches, and advertisements that match your query (e.g. “best Python web frameworks”). It then renders a search engine result page that shows ads and documents that matched your query as an HTML/CSS page with a little JavaScript.
Put simply, a view function lets a web developer respond to user queries and interactions with dynamically rendered response pages. Typically, a view function will query a database, which is where the persistent data the user is trying to retrieve lives. There are a slew of database technologies, and depending on its requirements, a web application may combine several of them to respond to requests.
The view function then takes the retrieved information and renders it into a page that the user can view. This rendering is the job of the template engine, which plugs dynamic values into page templates. Google likely has a single result page template, but depending on your query (and potentially, user profile data), the template will be populated with different results and advertisements.
Finally, the web server is a piece of software that receives the requests (e.g. answers for google.com and for the search URL), executes the view functions, and returns the responses to the browser. The web server is the glue that binds everything together.
To summarize: view functions are bits of Python code that are bound to handling web requests and doing something in response. Template engines are systems that allow rendering dynamic HTML/CSS/JavaScript depending on data accessed by the view function. Databases are stores of persistent data that are typically queried by view functions and passed along to template engines for plugging those values into parts of the page. And web servers put it all together.
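To make these four pieces concrete, here is a minimal sketch that uses only the Python standard library. It is not how any of the frameworks below implement things: sqlite3 stands in for the database, string.Template for the template engine, and a bare WSGI callable for the part the web server invokes. The table contents, the names search_view, application, and fake_request, and the choice of Python 3 are all assumptions made for illustration.

```python
import html
import sqlite3
from string import Template
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

# Database: an in-memory SQLite table standing in for persistent storage.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE pages (title TEXT, body TEXT)")
db.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [
        ("Flask", "A Python microframework"),
        ("Django", "A full-stack Python framework"),
    ],
)

# Template engine: string.Template plugs dynamic values into a page template.
PAGE = Template(
    "<html><body><h1>Results for $query</h1><ul>$items</ul></body></html>"
)


def search_view(query):
    """View function: query the database, then render the template."""
    rows = db.execute(
        "SELECT title FROM pages WHERE body LIKE ? ORDER BY title",
        ("%" + query + "%",),
    ).fetchall()
    items = "".join("<li>%s</li>" % html.escape(title) for (title,) in rows)
    return PAGE.substitute(query=html.escape(query), items=items)


def application(environ, start_response):
    """WSGI entry point: a web server calls this once per request."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    query = params.get("q", [""])[0]
    body = search_view(query).encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


def fake_request(query_string):
    """Call the WSGI app directly, the way a web server would."""
    environ = {"QUERY_STRING": query_string}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

Calling fake_request("q=micro") returns the status line and the rendered HTML without opening a socket. To serve the same application over HTTP during development, the standard library's wsgiref.simple_server.make_server can wrap it.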
Now that you understand these basics, you have to face an unfortunate truth: lots of different web frameworks exist that provide this functionality.
Since you only want to develop web apps fast, I’m only going to briefly cover three of these frameworks, and their relative trade-offs. These are: Django, Tornado, and Flask.
Django
Django is, by far, the most popular web framework for Python. It has excellent documentation and is very opinionated in how you should structure your web application. There are also a number of books written about it and a slew of open source modules and extensions.
Django has been used for a number of use cases: enterprise software-as-a-service web applications; consumer-facing, page-oriented software; rapid web application prototypes; content management systems; the list goes on and on.
Let’s evaluate it on the important functionality areas above:
- View Functions: They are defined either as plain Python functions or classes defined inside modules, typically a module called “views.py” living within a Django “application”, which is nothing more than a Python package that contains that file. They are mounted to certain URLs using a special URL dispatcher using regular expression patterns.
- Template Engine: Django has its own template engine designed to be user-friendly even to non-programmers. In this respect, the language is somewhat limited and quirky, and does not really re-use your knowledge of Python for templating. Many advanced programmers end up using an alternative template engine with Django, such as Jinja2.
- Database: Django is very opinionated about your database engine. It was written with the idea that everyone would use a SQL database system of some sort, such as MySQL, Postgres, or SQLite. It provides an object-relational mapper system, or ORM, which makes it easy to define new data storage objects through what are called Models. It also provides an excellent and customizable automatic admin interface that allows instance data to be created and managed using a web-based interface, complete with support for search, filtering, bulk operations, and the like. Despite these advantages, the Django ORM is often derided as a weaker codebase with a worse architecture than the more widely respected SQLAlchemy project.
- Web Server: There is no web server bundled in Django, save a development server not meant to be used in production. This leaves it up to you to integrate Django with a number of WSGI-compliant web servers that are out there, including Apache, nginx, gunicorn, and others.
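The idea of mounting view functions to URLs with regular expressions can be sketched in a few lines of plain Python. This is not Django's dispatcher, which does considerably more; the names URL_PATTERNS, dispatch, home, and article_detail are hypothetical, and the unused request argument only mimics the convention that a view receives the request first.

```python
import re


def home(request):
    """A view function for the site root."""
    return "Home page"


def article_detail(request, year, slug):
    """A view function that receives values captured from the URL."""
    return "Article %s from %s" % (slug, year)


# (pattern, view) pairs, tried in order; named groups become keyword arguments.
URL_PATTERNS = [
    (re.compile(r"^$"), home),
    (
        re.compile(r"^articles/(?P<year>[0-9]{4})/(?P<slug>[\w-]+)/$"),
        article_detail,
    ),
]


def dispatch(path, request=None):
    """Find the first pattern matching the path and call its view."""
    for pattern, view in URL_PATTERNS:
        match = pattern.match(path)
        if match:
            return view(request, **match.groupdict())
    return "404 Not Found"
```

With this table, dispatch("articles/2012/web-app/") calls article_detail with year="2012" and slug="web-app", while an unknown path falls through to the 404 string.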
At Parse.ly, we use Django for our main web application, but swap the default template engine for Jinja2. Though we have a Postgres database that benefits a bit from Django’s ORM and admin interface, the bulk of our data is stored in MongoDB, Redis, and Solr, and thus does not leverage the ORM at all. (See my related article, “On multi-form data”, for an explanation of why we combine databases.) Further, for other parts of our system that require access to our Postgres DB, we use SQLAlchemy. We run Django under nginx and uwsgi.
Tornado
Tornado is a web framework that was released by Facebook after its acquisition of Friendfeed. It has a significant architectural difference from Django in that it is built to solve the C10k problem: the challenge of building web servers to handle thousands of simultaneous web connections at one time. As a result, it bundles its own web server and expects you to use it.
Traditional web frameworks like Django expect that every web request will be handled by a separate web server thread. With thousands of simultaneous connections, this can overwhelm your web server with excessive memory usage, causing the server to slow down or even crash. Tornado is written the same way as other asynchronous web servers like nginx and Node.js. As a result, it has the same scaling benefits: it can handle thousands of concurrent requests while keeping your server’s memory usage stable.
This architectural difference has ramifications throughout your codebase, however. Tornado view functions tend to look different, and the usage of databases tends to be entirely different, too. So this isn’t the best choice for beginners, unless you know for a fact that your application is going to involve lots of concurrent connections from the get-go. Examples of this include: web chat systems, telephony applications, API servers, mobile backends, or some classes of “real-time” web applications.
Tornado has a nice overview document intended for beginners. The recent O’Reilly book, Introduction to Tornado, is also an excellent (and quick) read that goes through most of the facilities available in the framework.
- View Functions: Tornado view functions are implemented via classes known as Handlers, which are subclasses of tornado.web.RequestHandler. Similarly to Django, there is a URL dispatcher called the Application that maps URL regex patterns to Handlers. Unlike Django view functions, Tornado view functions are not meant to do much work. The reason for this is that all view functions run in a single thread, and thus any long-running code will slow down your entire web server. Instead, the responsibility of the function is to delegate work to other asynchronous services handled by Tornado’s server. The primary candidate here is to have Tornado make an asynchronous HTTP request to some other service. There are also some databases and database drivers that are written in an “asynchronous” style which you can use reliably with Tornado, but in general, the idea is to avoid database queries in your view functions.
- Template Engine: The approach to Tornado’s template engine is minimal and meant to optimize for your existing knowledge of Python. Templates are just HTML/CSS/JavaScript with special sequences that embed Python expressions inside them. Their template overview describes this well, and their reference page goes into more detail.
- Database: As mentioned earlier, Tornado doesn’t expect your view functions to hit a database often since this could slow your entire web server down. As a framework, it expects data querying to be “your problem”. There is a small wrapper for MySQL included, but this almost seems like an afterthought. Instead, I have seen most people put Tornado in front of other HTTP services that might be written using blocking frameworks like Django or Flask. I have also seen usage of async-friendly data stores such as MongoDB, CouchDB, and Solr. CouchDB and Solr both use HTTP as the client interface, so it is easy to hit these directly using Tornado’s built-in HTTP client. MongoDB has perhaps the best support: they shipped an official asynchronous driver called Motor, meant for use specifically with Tornado. Async drivers will likely become more common in the Python 3.x era, as Guido van Rossum (Python’s creator) is working on PEP 3156 to unify all of the async/event-driven Python frameworks.
- Web Server: Tornado bundles its own web server, which is perhaps the most powerful and convenient aspect of the framework. The beautiful thing about this is that the exact same web server you run locally for development is the one you can run in production.
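The single-threaded, delegate-and-wait style described above can be illustrated without Tornado at all. The sketch below uses asyncio, the standard library module that grew out of PEP 3156, and assumes Python 3.7 or later. It is not Tornado's API: handler, fetch_from_backend, and serve_all are hypothetical names, and asyncio.sleep stands in for a non-blocking call to another service.

```python
import asyncio

log = []  # records the order in which handlers start and finish


async def fetch_from_backend(name, delay):
    """Stand-in for a non-blocking request to some other service."""
    await asyncio.sleep(delay)  # yields control to the event loop while waiting
    return "result for %s" % name


async def handler(name, delay):
    """A request handler that delegates its slow work instead of blocking."""
    log.append("start " + name)
    result = await fetch_from_backend(name, delay)
    log.append("finish " + name)
    return result


async def serve_all():
    # Both handlers share one thread; the loop switches between them
    # whenever one of them is waiting at an await.
    return await asyncio.gather(handler("slow", 0.05), handler("fast", 0.01))


results = asyncio.run(serve_all())
```

The log shows the fast handler finishing while the slow one is still waiting, even though everything runs in one thread. Replacing asyncio.sleep with a blocking call such as time.sleep would stall every handler at once, which is the reason long-running work does not belong in this kind of view function.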
Flask
Flask is the newest web framework of these. The author of the framework has a PyCon presentation explaining its motivation. Funny enough, it was built out of an April Fool’s joke where the author “zipped up” two of his existing projects, Jinja2 (template engine) and Werkzeug (HTTP library), and glued them together with a small Python file, thus declaring it a new web “microframework”.
The joke became a real open source project which is notable for its simplicity, respect of Python’s facilities, strong documentation, and ease of use. Due to its reliance on existing, high-quality Python modules, the actual web framework is only approximately 1,000 lines of code. The quickstart application requires only a single Python file which, when run, gives you a working development web server that renders a dynamic response. For all these factors and more, it is my preferred web framework for new web applications, especially those that wouldn’t benefit from Django’s admin interface or Tornado’s concurrent request scaling.
- View Functions: View Functions are as simple as it gets in Flask. They are simply plain Python functions. They are mounted to URL patterns using a Python Decorator called route. This includes support for Variable Rules, which tend to be much more comprehensible than regex-based routes as in Django and Tornado. Similarly to Django, view functions in Flask are where the bulk of your application’s logic will go, including things like database queries.
- Template Engine: Flask is meant to be used with Jinja2, an excellent and well-documented template engine that is also widely used by Django developers as a drop-in replacement for that framework’s template engine. It strikes a balance: like Django’s template language, it is meant to be understood by non-programmers (see Template Designer Documentation), while also having good interoperability with Python code and support for a wide range of control structures.
- Database: This is the least opinionated part of the Flask framework; it makes no recommendation as to what database to use, considering this to be beyond the scope of a core web framework. That said, the Flask Extension Registry contains some modules that help integrate Flask with this or that database technology, such as Flask-SQLAlchemy (provides support for all SQL data stores) and Flask-PyMongo (provides support for MongoDB connections). However, you can just as easily query databases by importing the appropriate Python client library, which usually exists for any given DB. That is, indeed, “The Flask Way” of doing this.
- Web Server: Though no web server is bundled in the framework, it can be deployed even more easily than Django to any number of WSGI-compliant web servers, as described in the Deployment Options section of their documentation. The built-in debug-mode development server is extremely handy for local development, supporting full stack traces and even an embedded Python interpreter for inspecting the state of variables at the time of the web server crash.
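To show how a route decorator with variable rules might work under the hood, here is a toy registry in plain Python. It is emphatically not Flask's implementation (Flask delegates routing to Werkzeug and supports typed converters and much more); TinyApp, handle, index, and profile are hypothetical names, and the rule strings are assumed to contain no other regex metacharacters.

```python
import re


class TinyApp:
    """A toy URL registry illustrating decorator-based routing."""

    def __init__(self):
        self.routes = []

    def route(self, rule):
        # Turn a rule like "/user/<name>" into "^/user/(?P<name>[^/]+)$".
        regex = "^" + re.sub(r"<(\w+)>", r"(?P<\1>[^/]+)", rule) + "$"
        pattern = re.compile(regex)

        def decorator(view):
            self.routes.append((pattern, view))
            return view  # the view stays a plain Python function

        return decorator

    def handle(self, path):
        """Call the first view whose rule matches the path."""
        for pattern, view in self.routes:
            match = pattern.match(path)
            if match:
                return view(**match.groupdict())
        return "404 Not Found"


app = TinyApp()


@app.route("/")
def index():
    return "Hello, world!"


@app.route("/user/<name>")
def profile(name):
    return "Profile page for %s" % name
```

Here app.handle("/user/alice") calls profile(name="alice"). Because the decorator returns the original function, profile can still be called and tested directly like any other function.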
Conclusion: pick a stack
This article has lots of resources that can help you pick a stack, but I have some opinions about how you can get started easily.
- Use Python 2.7. Python 3 isn’t fully ready yet, but will be soon.
- Target HTML4 and CSS2. Though you need to be aware of HTML5 and CSS3, the lion’s share of web development today targets the earlier versions of these standards, and the changes to them are mostly incremental.
- Use jQuery and Bootstrap. Among JavaScript frameworks, these will cover the largest number of use cases while having the most available online tutorials and examples.
- Start with Flask, switch as necessary. If you are just getting started with web development, you’ll be able to assemble an application with the above components easily in Flask. You may not know yet whether your application requires thousands of concurrent requests (as provided by Tornado) or whether you would benefit from extensive open source plugins / a full-featured data model framework (as in Django). So defer those decisions to when you are better able to make them in an informed and careful way.
- Pick the simplest database possible, upgrade later. Since Flask doesn’t impose any database on you, you can choose to pick the simplest database that could possibly work. Some good candidates for your early days are SQLite, MongoDB, or Redis. As you start to understand your requirements more, you may need to upgrade to a full-fledged SQL RDBMS such as Postgres or a full-text index such as Solr. But you won’t know for sure until you fully understand the form of your data, so why lock into heavyweight solutions up-front? (see my related article, “On multi-form data”)
- Pick a host that matches your system administration skill level. You will need a Linux server available to you to deploy your app, and for that I suggest simple VPS providers such as Rackspace Cloud or Linode. You will hear a lot of people mention Amazon EC2 but I recommend you only switch to EC2 later once you understand its quirks and tradeoffs vs traditional providers (as well as its benefits). Setting your server up may require some knowledge of UNIX and system administration which is beyond you. If that is the case, you can consider a shared hosting provider such as Webfaction for your early days, which has good Python support and prefab deployment setups for Django, Flask, and nginx available. If you don’t mind paying a premium to have someone else host all your infrastructure for you, you may also want to consider a PaaS provider such as Heroku. Personally I don’t recommend these services for economic reasons, but they have many proponents.
- Develop locally, deploy simply. Each of these web frameworks has a workflow for developing locally. You should use those until you get comfortable with the frameworks. That said, the day will come when you want to see your web application live on a real web server. For that moment, if you are developing with Django or Flask, I recommend you deploy to uWSGI behind nginx. Flask has docs for this, so does Django. To actually push your code to your servers, I recommend you start with extremely simple UNIX tools: rsync and ssh. With rsync, you can copy your Python project quickly to your server, and thanks to rsync’s incremental copy, it will only copy changed files. For example:
rsync -Pav myproject/ remoteserver:myproject/
With ssh, you can execute remote commands on your server, such as
ssh remoteserver sudo restart nginx
to restart your nginx web server. Once you start to need fancier deployment options, you can upgrade to Fabric, a Python-based deployment tool, replacing your rsync command with project tools and replacing your ssh commands with calls to run().
- Don’t develop on Windows. This is unfortunate but true. The lack of support for UNIX tools on Windows puts it at a significant disadvantage for building modern web applications that deploy to Linux servers running Python. If you are running Windows on your development workstation and don’t feel you have a choice about the matter, you will want to investigate virtualization options to do your development under a Linux guest virtual machine. These include VirtualBox, Vagrant, and VMware.
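Before reaching for Fabric, the rsync and ssh steps from the deployment advice above can be wrapped in a small Python script. This is only a sketch built on the standard library's subprocess module, not Fabric's API. The function names are hypothetical, the host and paths come from the example commands in this post, and dry_run defaults to True so that nothing touches a server until you ask it to.

```python
import shlex
import subprocess


def rsync_command(local_dir, host, remote_dir):
    """Build the rsync call; trailing slashes mean 'copy the contents'."""
    source = local_dir.rstrip("/") + "/"
    target = "%s:%s/" % (host, remote_dir.rstrip("/"))
    return ["rsync", "-Pav", source, target]


def ssh_command(host, remote_command):
    """Build the ssh call that runs one command on the server."""
    return ["ssh", host] + shlex.split(remote_command)


def deploy(local_dir, host, remote_dir, restart_command, dry_run=True):
    """Copy the project, then restart the server process."""
    commands = [
        rsync_command(local_dir, host, remote_dir),
        ssh_command(host, restart_command),
    ]
    for argv in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(shlex.quote(arg) for arg in argv))
        else:
            # Raises CalledProcessError if the command fails.
            subprocess.check_call(argv)
    return commands
```

Calling deploy("myproject", "remoteserver", "myproject", "sudo restart nginx") prints the two commands shown in the text. Passing dry_run=False actually runs them and stops at the first failure.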
I look forward to seeing the web apps you build and deploy.
—
UPDATE: I’ve converted this blog post into a full-blown tutorial. You can find the code, slides, and video in this post. Covers all my “suggested” technologies, like Twitter Bootstrap, jQuery, Flask, Jinja2, MongoDB, Fabric, nginx, supervisor, uWSGI, etc.
Do you want to do modern Python web development on a daily basis, working on some of the most interesting problems at the intersection of large-scale data analysis and information visualization? Check out Parse.ly — we’re hiring! Engineers work ideally in Eastern or Central Time Zone, as this is a remote position for our fully distributed team.
Andrew Montalenti (aka pixelmonkey, amontalenti) is the co-founder and CTO of Parse.ly, which provides data insights to the web's best publishers. This entry was posted on Thursday, June 14th, 2012 at 4:15 pm and is filed under Open Source, Programming, Startups, Technology.