From the book 《Modern Authentication with Azure Active Directory for Web Applications》
SAML
The Security Assertion Markup Language, SAML for short, appeared on the scene mostly for handling this very problem. Its origin dates back to the early 2000s as a concerted effort of various industry players that wanted to establish an interoperable solution to the SSO problem. SAML 2.0 is the most widely adopted version, with some systems (especially those in academia) still on 1.1. Although SAML touches on how to secure web services and lots of other scenarios, its most widely adopted use case is web browser–based SSO, and that’s what I’m going to focus on.
Although both Azure Active Directory and Active Directory Federation Service (ADFS) (from version 2 onward) support SAML, the .NET Framework does not offer any classes out of the box for building applications that understand the protocol. Developing with the .NET Framework is the main focus of this book, so even if I provided a detailed description of how SAML works, it would not be very actionable for you. However, the importance of SAML as a framing reference for identity problems cannot be overstated. Moreover, a good chunk of the jargon you’ll encounter comes straight from SAML. Learning the basics is a good investment for any beginner in this space.
In a nutshell, SAML sidesteps the shortcomings of domain-bound cookies by, you guessed it, adding an extra abstraction layer. Instead of relying on browser automatisms, SAML introduces a sequence of application-level messages that enable an application to send authentication requests and obtain tokens that can be sent across domains. Once those tokens successfully cross domain boundaries, they can be validated by the target app and used to initialize a session with the new domain. I’ll unpack the scenario as soon as I define more terminology to work with.
SAML follows precisely the blueprint introduced in the claims-based identity section. Let’s draw some correspondences between the abstract entities defined in the general meta protocol and concrete artifacts from SAML.
Roles
I am sure you noticed that the sample scenario I introduced earlier contained one entity playing the role of the IdP (that was airline.example.com and its profile store). The good news is that in SAML, IdPs are called . . . IdPs.
In the terminology of claims-based identity, the cars.example.com.uk application is called an RP. In SAML, it is known as a service provider, or SP. Another important role is the subject, the entity that is meant to be authenticated. In the vast majority of cases, that’s simply the user. SAML also describes other roles, but the ones I’ve enumerated suffice for the purposes of this book.
Artifacts
SAML is guilty of having introduced not one but two widely successful technologies: the protocol it defines and the specific token format that the protocol’s messages exchange. I say “guilty” facetiously: people commonly refer to both technologies with the same term, “SAML,” which has caused confusion for the past decade or so. When somebody states, “My app supports SAML,” you always have to ask for clarification: “The protocol or the token format?”
In SAML parlance, tokens are called assertions. They follow the exact token semantic described in the preceding section: they are a vessel for the IdP’s assertions about the user (excuse me), the subject. And they are signed.
The SAML acronym, together with the epoch in which it was conceived, probably already gave away that SAML assertions are based on XML. In fact, the entire specification defines everything in terms of XML. That leads to a very expressive, powerful format that can represent pretty much anything. However, all that expressivity comes with various drawbacks. The main one is that XML is very verbose, which leads to big tokens. Furthermore, in XML, the same document can be expressed in multiple equivalent representations, and that flexibility becomes a problem when you need to perform signatures, where two elements listed in a different order can break a signature verification. Those are the main reasons that you won’t encounter SAML assertions in modern protocols later in the book, apart from cases in which they are used to bridge existing solutions to new ones.
It is tempting for me to use the SAML token structure to start entering into the mechanics of how claims are defined, tokens are scoped, and signatures are applied, but, as I said, SAML is not at the core of the modern protocols that are the main focus of this book. Those explanations will have to wait until a bit later.
Another important artifact defined by SAML is the format of its metadata documents. You already encountered the idea of IdP metadata in the section on claims-based identity. SAML goes well beyond that: it defines an XML-based format that can be used for describing endpoints, identifiers, and keys for IdPs, SPs, and many other entities.
Messages
SAML defines lots of different messages that support various sign-in flows, from the one triggered by an unauthenticated request to an SP (similar to what’s described in the claims-identity section), to one in which the IdP itself initiates a sign-on with a given SP. One interesting fact is that besides signing its assertions, SAML often mandates that messages themselves need to be signed as well.
The other interesting category of SAML messages, Single Logout, focuses on providing a mechanism to propagate a sign-out operation to all the applications participating in an SSO session. SAML defines many other messages for various other operations, which I won’t mention here.
Status
SAML has had an impressive ride from its first versions in the early 2000s. It’s still going strong in many of today’s SSO deployments in enterprises, government, and education. SAML is widely supported in SSO products, developer libraries (across platforms and languages), and cloud services. For many of those products, the SAML functionality is the centerpiece of their offering. As I mentioned, Active Directory itself (both ADFS from version 2 onward and Azure AD) supports it. On the software vendor side, many applications in active development today use SAML, including software as a service (SaaS) apps. The protocol is alive and well.
That said, if you are starting to develop a new solution, SAML might not be your best choice. Although really well suited for solving the cross-SSO domain problem and bringing lots of good features to the table, SAML does not offer the flexibility for addressing the challenges of the modern topologies I will introduce later in this chapter. Furthermore, its own richness translates into expensive requirements in term of cryptography and bandwidth that are not proportionate to the actual needs of modern applications. I won’t go so far as to say that SAML is dead, as was fashionable to say in identity circles a couple of years ago, but it is certainly no longer the recipient of innovation. I believe it will be around for a long time still, but mostly as a bridge to existing systems.