I recently had to deal with a large and complex XML document. I wanted to use Python idioms like generators and list comprehensions to carve the XML up into more meaningful data. However, because the XML spec is so complex, most XML parsing libraries expose their own non-standard interfaces for accessing nodes.
While searching for a solution, I came across the xmltodict library. It turned out to be exactly what I needed, and I was surprised at how intuitive the API is. Essentially, you give it an XML document and it gives you a native Python dictionary, with straightforward mappings for XML-specific features like attributes (which become keys prefixed with @) and namespaces. For example, repeated elements in the form of
<item>1</item><item>2</item><item>3</item> simply become
item: ['1', '2', '3']. Note that the values stay strings, since XML itself carries no type information. Once the data is in a Python dictionary, converting it to JSON is trivial because the two models are so similar.
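To make the mapping described above concrete, here is a minimal sketch. It assumes xmltodict is installed, and the XML document and variable names are made up purely for illustration.

```python
import json

import xmltodict  # third-party: pip install xmltodict

# A small, made-up document with one attribute and a repeated element.
xml = '<order id="42"><item>1</item><item>2</item><item>3</item></order>'

doc = xmltodict.parse(xml)
order = doc["order"]

order_id = order["@id"]   # attributes are stored under "@"-prefixed keys
items = order["item"]     # repeated elements become a list of strings

# Plain Python idioms now work on the parsed data.
numbers = [int(value) for value in items]

# The parsed result is made of dicts, lists and strings, so it serializes directly.
as_json = json.dumps(doc)
print(as_json)
```

One thing to watch out for: a single <item> would come back as a plain string rather than a one-element list, so code that iterates over items should account for both shapes.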
Below is a working snippet of a few test cases I came up with to demonstrate its functionality.