Setting Up JSON

Jed Rembold

November 13, 2024

Announcements

  • Project 4: Enigma due next Monday night
    • Get a start if you haven’t already!
    • Sections today and tomorrow will help with milestones 3 and 4
  • I’ll have test feedback for you by Friday
    • New grade reports will be posted as soon as I have the test scores done
  • Graphics Contest winner: Nobody
    • The free 100% on any assignment goes unclaimed
    • The last contest of the semester will be a Game Contest. Instructions up by the end of the day.
  • Talks!
    • Women in Tech tonight!
    • Techbytes tomorrow, with guest speaker from industry!
  • Polling: polling.jedrembold.prof

Review Question

Let’s consider a greatly simplified Enigma machine, which only has one rotor that is not turning. So the signal goes through the rotor then the reflector and back through the rotor. Given the rotor and reflector mappings shown to the right, what would the word python encrypt to?

  1. aicmnz
  2. hnktge
  3. rfqbls
  4. zghpmy

Let’s Talk About Sets

Pythonic Sets

  • Enclosed within squiggly brackets
  • No key-value pairs, just single values separated by commas
digits = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
squares = { 0, 1, 4, 9 }
primary = { "red", "green", "blue" }
  • Set elements must be immutable
  • Sets themselves are generally mutable
  • Cannot create an empty set just using { }!
    • Python assumes this to be an empty dictionary!
    • Must instead use set().
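
A minimal sketch of the points above. The variable names (empty_braces, empty_set) are just illustrative choices, not anything from the course code:

```python
# {} creates an empty dictionary, not an empty set
empty_braces = {}
empty_set = set()

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>

# Sets are mutable, so elements can be added after creation
primary = {"red", "green", "blue"}
primary.add("yellow")

# Elements must be immutable (hashable): a list cannot go in a set
try:
    bad = {[1, 2], [3, 4]}
except TypeError as err:
    print("Cannot store a list in a set:", err)
```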

Set Operations

  • The fundamental set operation is membership (∈)
    • 3 ∈ primes
    • 3 ∉ evens
    • red ∈ primary
    • -1 ∉ N
  • The union of the sets \(A\) and \(B\) (\(A \cup B\)) consists of all elements in either \(A\) or \(B\) or both.
  • The intersection of the sets \(A\) and \(B\) (\(A \cap B\)) consists of all elements in both \(A\) and \(B\).
  • The set difference of \(A\) and \(B\) (\(A - B\)) consists of all elements in \(A\) but not in \(B\).
  • The symmetric set difference of \(A\) and \(B\) (\(A\triangle B\)) consists of all elements in \(A\) or \(B\) but not in both.

Python Implementations

  • Python’s built-in implementation of sets supports all these same operations
  • Can either use appropriately named methods on sets or operators between sets
  • Membership: 3 in primes
  • Union: A.union(B) or A | B
  • Intersection: A.intersection(B) or A & B
  • Difference: A.difference(B) or A - B
  • Symmetric difference: A.symmetric_difference(B) or A ^ B
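
As a quick sketch, the method form and the operator form of each operation should give the same result. The sets A and B here are made-up example values:

```python
# Two small example sets (values chosen only for illustration)
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

# Membership
print(3 in A)   # True
print(5 in A)   # False

# Each operation, written with the operator
union_ab = A | B          # elements in A or B or both
intersect_ab = A & B      # elements in both A and B
difference_ab = A - B     # elements in A but not in B
symmetric_ab = A ^ B      # elements in exactly one of A or B

# The named methods compute the same sets
print(union_ab == A.union(B))                     # True
print(intersect_ab == A.intersection(B))          # True
print(difference_ab == A.difference(B))           # True
print(symmetric_ab == A.symmetric_difference(B))  # True
```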

Venn Diagrams

  • A Venn Diagram is a graphical representation of a set which indicates common elements as overlapping areas
  • The following Venn diagrams illustrate the effect of the 4 primary set operations

[Figure: Venn diagrams of sets A and B illustrating A ∪ B, A ∩ B, A - B, and A ∆ B]

Practice

If we have the following sets from earlier:

digits = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
evens = { 0, 2, 4, 6, 8 }
odds = { 1, 3, 5, 7, 9 }
primes = { 2, 3, 5, 7 }
squares = { 0, 1, 4, 9 }

What is the value of each of the following:

  • evens ∪ squares
  • odds ∩ primes
  • primes - evens
  • odds ∆ squares
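
Once you have worked these out by hand, one way to check your answers is to let Python compute them. This sketch uses the operator forms and deliberately leaves the printed results for you to compare against:

```python
digits = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
evens = {0, 2, 4, 6, 8}
odds = {1, 3, 5, 7, 9}
primes = {2, 3, 5, 7}
squares = {0, 1, 4, 9}

# Sets are unordered, so the printed order may differ from what you wrote
print(evens | squares)   # evens ∪ squares
print(odds & primes)     # odds ∩ primes
print(primes - evens)    # primes - evens
print(odds ^ squares)    # odds ∆ squares
```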

Understanding Check

Looking at the same sets:

digits = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
evens = { 0, 2, 4, 6, 8 }
odds = { 1, 3, 5, 7, 9 }
primes = { 2, 3, 5, 7 }
squares = { 0, 1, 4, 9 }

What is the set resulting from: \[ (\text{primes} \cap \text{evens}) \cup (\text{odds}\cap\text{squares})\]

  1. { 1, 2, 9 }
  2. { 1, 3, 4, 5 }
  3. { 0, 3, 4, 5, 7 }

Set Relationships

  • Sets \(A\) and \(B\) are equal (\(A = B\)) if they have the same elements.
    • This would make them the same circles in a Venn diagram
    • In Python: A == B
  • Set \(A\) is a subset of \(B\) (\(A\subseteq B\)) if all the elements in \(A\) are also in \(B\).
    • This would mean that the circle for \(A\) would be entirely inside (or equal to) the circle of \(B\)
    • In Python: A <= B
  • Set \(A\) is a proper subset of \(B\) (\(A\subset B\)) if \(A\) is a subset of \(B\) and the two sets are not equal
    • In Python: A < B
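
A short sketch of these comparisons in Python. The set same_evens is a made-up name used to show that element order does not affect equality:

```python
digits = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
evens = {0, 2, 4, 6, 8}
same_evens = {8, 6, 4, 2, 0}   # same elements, written in a different order

print(evens == same_evens)   # True: equality ignores order
print(evens <= digits)       # True: every even digit is a digit
print(evens < digits)        # True: subset, and the sets are not equal
print(evens <= same_evens)   # True: a set is a subset of itself
print(evens < same_evens)    # False: equal sets are not proper subsets

# The issubset method is the named equivalent of <=
print(evens.issubset(digits))  # True
```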

Python Set Methods

  • Can also use “set comprehension” to generate a set { x for x in range(0,100,2) }
Function                         Description
len(|||set|||)                   Returns the number of elements in a set
|||elem||| in |||set|||          Returns True if |||elem||| is in the set
|||set|||.copy()                 Creates and returns a shallow copy of the set
|||set|||.add(|||elem|||)        Adds the specified |||elem||| to the set
|||set|||.remove(|||elem|||)     Removes the element from the set, raising a KeyError if it is missing
|||set|||.discard(|||elem|||)    Removes the element from the set, doing nothing if it is missing
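
The following sketch exercises each of these on a small set built with a set comprehension. The name backup is just an illustrative choice:

```python
# Build a set with a set comprehension: the even numbers below 10
evens = {x for x in range(0, 10, 2)}

# copy() gives an independent set, so later changes don't affect it
backup = evens.copy()

evens.add(10)       # add a new element
evens.remove(0)     # remove an element that is present
evens.discard(99)   # 99 is missing, so discard quietly does nothing

# remove() on a missing element raises a KeyError
try:
    evens.remove(99)
except KeyError:
    print("remove() raised KeyError for a missing element")

print(len(evens))     # 5
print(0 in evens)     # False
print(0 in backup)    # True: the copy was not changed
```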

Why use sets?

Sets come up naturally in many situations
Many real-world applications involve unordered collections of unique elements, for which sets are the natural model.
Sets have a well-established mathematical foundation
If you can frame your application in terms of sets, you can rely on the various mathematical properties that apply to sets.
Sets can make it easier to reason about your program
One of the advantages of mathematical abstraction is that using it often makes it easy to think clearly and rigorously about what your program does.
Many important algorithms are described in terms of sets
If you look at websites that describe some of the most important algorithms in computer science, many of them frame those descriptions in terms of set operations.

Compound Data

Representing Data

  • To use computation effectively, we frequently need to be able to represent real world data in a way that computers can easily work with
    • Real world data is often more complicated or nuanced than just “a list of numbers”
  • Python’s existing data structures are tools, which you can use to help represent certain ideas
    • Lists when you have sequential type data, wherein there is a logical ordering to the data in question (where position matters)
      • Example: GPA over the course of 4 years
    • Tuples or classes when you have elements that should be grouped together but which have no inherent ordering. Generally use tuples for simple records and write custom classes for more complex ones. Could potentially also use a dictionary or set.
      • Example: Student names in a class
    • Maps or dictionaries when you have specific keys corresponding to other values.
      • Example: Student grades

Tricky Data

  • Human readable data is not always the best machine-readable data!
Name Class Q1 Mid Q3 Final
Sally Python A B B A
Jake Python B B B C
James Astro B B A
Lily Astro A A B
Ben Python C B B A
  • Storing the above in a 2D array would work, but would be frustrating to work with

A Computer Friendly Approach

  • Student grades are time ordered, so we could use a list for the grades
  • Each student has a corresponding sequence of grades (and students are unordered), so we could use a dictionary where student names are the keys and the list of grades the values
  • Each class corresponds to an unordered set of students. Could have another dictionary where the keys were the class names and the values were the dictionary of students/grades

Example Representation

{
    "Python": {
        "Sally": ["A", "B", "B", "A"],
        "Jake": ["B", "B", "B", "C"],
        "Ben": ["C", "B", "B", "A"]
    },
    "Astro": {
        "James": ["B", "B", "A"],
        "Lily": ["A", "A", "B"]
    }
}
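
Because this representation is also valid Python, it can be assigned to a variable and explored with ordinary dictionary and list indexing. A small sketch, where the variable name grades is an assumption:

```python
grades = {
    "Python": {
        "Sally": ["A", "B", "B", "A"],
        "Jake": ["B", "B", "B", "C"],
        "Ben": ["C", "B", "B", "A"]
    },
    "Astro": {
        "James": ["B", "B", "A"],
        "Lily": ["A", "A", "B"]
    }
}

# Class name, then student name, gives that student's list of grades
print(grades["Python"]["Sally"])    # ['A', 'B', 'B', 'A']

# Grades are time ordered, so index -1 is the most recent one
print(grades["Astro"]["Lily"][-1])  # B

# Walk the whole structure: classes, then students, then grades
for course, students in grades.items():
    for name, marks in students.items():
        print(course, name, marks)

# The students in a class form an unordered collection of unique names
python_students = set(grades["Python"])
print("Jake" in python_students)    # True
```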

Compound Structure Storage

  • Structures representing complicated data can often be large enough that you don’t want to store them within your program itself
  • We can put them in their own file, but reading them in with our current tools would be complicated
    • Current methods read in text, so we would need to parse the text to identify what data structures we needed to create and what elements we needed to add
    • This is certainly possible, but potentially more overhead than what we would like for some structures
  • Useful then to store the data structure in a file in such a format that it can be easily read into Python

File I/O

  • A variety of ways this can be done
    • XML, YAML, JSON
  • JSON is particularly interesting to us, because its syntax almost exactly matches Python’s (even though it was made for JavaScript)
  • Python has a built-in library to read and write JSON files, just called json
    • json.load(|||file_handle|||)
      • Loads the JSON data structure from the specified file into its Python equivalent
    • json.dump(|||data_object|||, |||file_handle|||)
      • Writes a JSON text representation of the data object to the given file
    • Both functions are used inside our normal with open(|||filename|||) as |||fhandle|||: syntax

Using JSON

  • To read a JSON file into a variable data:

    import json
    with open('file.json') as fh:
        data = json.load(fh)
  • To write a variable with complex structure out to a JSON file:

    import json
    with open('file.json', 'w') as fh:
        json.dump(data, fh)
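
Putting the two together, here is a round-trip sketch that writes a structure out and reads it back. The file name grades.json is hypothetical, and the temporary folder is only there so the example does not touch any real files:

```python
import json
import os
import tempfile

# A small nested structure to save (a piece of the earlier grades example)
grades = {
    "Astro": {
        "James": ["B", "B", "A"],
        "Lily": ["A", "A", "B"]
    }
}

# Work inside a temporary folder that is cleaned up automatically
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "grades.json")

    # Write the structure out; indent=4 just makes the file easier to read
    with open(path, "w") as fh:
        json.dump(grades, fh, indent=4)

    # Read it back into a new variable
    with open(path) as fh:
        loaded = json.load(fh)

# The loaded structure matches what was written
print(loaded == grades)  # True
```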

The Power of JSON

  • One very nice aspect of JSON is that it is often the de facto way that information is passed around the internet
  • This means it can be easy to find data providers where you can access or download information already in a JSON format
  • DND Fireball Spell info here
  • We could download this information to a file, which we could then read in and use within our Python program
  • Later we’ll also look at how we could process the information straight from the internet as well

JSON Gotchas

  • If you are writing JSON files from within Python or using files gotten elsewhere, they should already be properly formatted
  • If you need/want to edit a JSON file directly though, you should be aware of a few “gotchas” where the JSON syntax varies slightly from Python’s syntax
    • You cannot have trailing commas at the end of a JSON structure
      • Something like [1, 2, 3,] is perfectly fine in Python, but illegal in JSON
    • JSON strings require double quotes
      • In Python you can use either double or single quotes, but JSON requires double
    • Booleans are all lowercase in JSON
      • true and false in JSON, versus True and False in Python
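
These gotchas can be seen by handing small pieces of text to the json library directly. This sketch uses json.loads and json.dumps, the string-based counterparts of json.load and json.dump; the sample strings are made up for illustration:

```python
import json

# Python booleans are written out in lowercase JSON form
print(json.dumps([True, False]))      # [true, false]

# Lowercase JSON booleans come back as Python booleans
print(json.loads('{"done": true}'))   # {'done': True}

# Each of these is fine as Python syntax but is not valid JSON
bad_examples = [
    '[1, 2, 3,]',          # trailing comma
    "{'name': 'Sally'}",   # single-quoted strings
    '{"done": True}',      # capitalized boolean
]

rejected = []
for text in bad_examples:
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        rejected.append(text)
        print("Invalid JSON:", text, "->", err)
```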