MATH 4440/5440 IntroductionToSageForCrypto

327 days ago by kast5807

Welcome to Sage! 

This is a worksheet to introduce Sage and Python to undergraduates taking a first cryptography course, and not necessarily having any programming background.  Students should know the concepts of prime factorization, caesar cipher and modular arithmetic, for the purposes of examples.

Created by Katherine E. Stange

(I release this work to the public domain; feel free to use and adapt. I don't mind if you let me know it was useful, though, or provide feedback.)

 

Section 1:  Overview of Interacting with Sage

 

This is a "notebook" where you type commands and see the results.  Sage runs on the underlying programming language Python.  To get started, ask Sage what is 2+2 by clicking on the cell and hitting shift+enter.  This "evaluates" the cell.

2+2 
       
4
4

Your turn.  Find out what is $(5 \cdot 2^{10}+3)/47$.  In Python, exponents are typed like this:  2^3 = 8, and multiplication is typed like this: 2*3 = 6.

(Hint: the answer should be 109.)

 
       

An important thing to know about Sage is that it has variables which are stored throughout the notebook.  To save a value into a variable name (to "assign a value to the variable"), e.g. save the value 3 into the variable "mynumber", you do this:

mynumber = 3 
       

Sage doesn't actually print out anything in response to that, but now the name "mynumber" will return 3 if you use it.  Ask Sage, what is mynumber+3?

 
       
6
6

Sage is full of in-built number theory functions.  Here are some self-explanatory examples (please always evaluate each example cell as we work through this notebook):

is_prime(105) 
       
factor(105) 
       

It's important to know the alternate format for calling a function on an object:

105.factor() 
       

Another important thing to know is how to check equalities or inequalities.  Try these out:

3 > 7 
       
3 < 7 
       
3 == 7 
       
3+4 == 7 
       

Did you notice the double ==?  That is for "comparison" (meaning the answer is true or false, they are equal or not equal).  (You used a single = to assign a value to a variable when you defined mynumber.  Don't forget the difference.) 


Now please determine whether $\gcd(36,54)$ is larger than $19$, or not.  Write something which will evaluate to True or False.  The syntax for $\gcd$ is just gcd(36,54).  (If you need a refresher, gcd = greatest common divisor, the biggest integer dividing both numbers.)

 
       

Section 2: Figuring out Sage Functions

Oftentimes, you will want to find a function you know should exist.  There are a few really good strategies:

(1) Tab completion for functions.  If you type the beginning of the name of a function, then hit tab, the possible functions that start that way appear.

Try it now here.  Note, you can hit "escape" to get out of the menu that appears.

prime 
       

 (2) Tab completion for objects.  Once you have defined something, it has all kinds of things you can do to it.  Type the thing's name, followed by a period, and hit tab to see your options. 

Try it now here.  Note, you can hit "escape" to get out of the menu that appears.

mynumber. 
       

Now experiment with using some of the functions that appeared on mynumber, and see if you can guess what they do.

 
       

(3) Question mark.  If you have a function, and you want to see the documentation for that function, type a question mark after the function, shift-enter, and Sage will pop up info.

Try it here:

factor? 
       

File: /usr/local/sage-8.4/sage/local/lib/python2.7/site-packages/sage/arith/misc.py

Type: <type ‘function’>

Definition: factor(n, proof=None, int_=False, algorithm=’pari’, verbose=0, **kwds)

Docstring:

Returns the factorization of n. The result depends on the type of n.

If n is an integer, returns the factorization as an object of type Factorization.

If n is not an integer, n.factor(proof=proof, **kwds) gets called. See n.factor?? for more documentation in this case.

Warning

This means that applying factor to an integer result of a symbolic computation will not factor the integer, because it is considered as an element of a larger symbolic ring.

EXAMPLES:

sage: f(n)=n^2
sage: is_prime(f(3))
False
sage: factor(f(3))
9

INPUT:

  • n - an nonzero integer
  • proof - bool or None (default: None)
  • int_ - bool (default: False) whether to return answers as Python ints
  • algorithm - string
    • 'pari' - (default) use the PARI c library
    • 'kash' - use KASH computer algebra system (requires the optional kash package be installed)
    • 'magma' - use Magma (requires magma be installed)
  • verbose - integer (default: 0); PARI’s debug variable is set to this; e.g., set to 4 or 8 to see lots of output during factorization.

OUTPUT:

  • factorization of n

The qsieve and ecm commands give access to highly optimized implementations of algorithms for doing certain integer factorization problems. These implementations are not used by the generic factor command, which currently just calls PARI (note that PARI also implements sieve and ecm algorithms, but they aren’t as optimized). Thus you might consider using them instead for certain numbers.

The factorization returned is an element of the class Factorization; see Factorization?? for more details, and examples below for usage. A Factorization contains both the unit factor (+1 or -1) and a sorted list of (prime, exponent) pairs.

The factorization displays in pretty-print format but it is easy to obtain access to the (prime,exponent) pairs and the unit, to recover the number from its factorization, and even to multiply two factorizations. See examples below.

EXAMPLES:

sage: factor(500)
2^2 * 5^3
sage: factor(-20)
-1 * 2^2 * 5
sage: f=factor(-20)
sage: list(f)
[(2, 2), (5, 1)]
sage: f.unit()
-1
sage: f.value()
-20
sage: factor( -next_prime(10^2) * next_prime(10^7) )
-1 * 101 * 10000019
sage: factor(-500, algorithm='kash')      # optional - kash
-1 * 2^2 * 5^3
sage: factor(-500, algorithm='magma')     # optional - magma
-1 * 2^2 * 5^3
sage: factor(0)
Traceback (click to the left of this block for traceback)
...

                                
                            

File: /usr/local/sage-8.4/sage/local/lib/python2.7/site-packages/sage/arith/misc.py

Type: <type ‘function’>

Definition: factor(n, proof=None, int_=False, algorithm=’pari’, verbose=0, **kwds)

Docstring:

Returns the factorization of n. The result depends on the type of n.

If n is an integer, returns the factorization as an object of type Factorization.

If n is not an integer, n.factor(proof=proof, **kwds) gets called. See n.factor?? for more documentation in this case.

Warning

This means that applying factor to an integer result of a symbolic computation will not factor the integer, because it is considered as an element of a larger symbolic ring.

EXAMPLES:

sage: f(n)=n^2
sage: is_prime(f(3))
False
sage: factor(f(3))
9

INPUT:

  • n - an nonzero integer
  • proof - bool or None (default: None)
  • int_ - bool (default: False) whether to return answers as Python ints
  • algorithm - string
    • 'pari' - (default) use the PARI c library
    • 'kash' - use KASH computer algebra system (requires the optional kash package be installed)
    • 'magma' - use Magma (requires magma be installed)
  • verbose - integer (default: 0); PARI’s debug variable is set to this; e.g., set to 4 or 8 to see lots of output during factorization.

OUTPUT:

  • factorization of n

The qsieve and ecm commands give access to highly optimized implementations of algorithms for doing certain integer factorization problems. These implementations are not used by the generic factor command, which currently just calls PARI (note that PARI also implements sieve and ecm algorithms, but they aren’t as optimized). Thus you might consider using them instead for certain numbers.

The factorization returned is an element of the class Factorization; see Factorization?? for more details, and examples below for usage. A Factorization contains both the unit factor (+1 or -1) and a sorted list of (prime, exponent) pairs.

The factorization displays in pretty-print format but it is easy to obtain access to the (prime,exponent) pairs and the unit, to recover the number from its factorization, and even to multiply two factorizations. See examples below.

EXAMPLES:

sage: factor(500)
2^2 * 5^3
sage: factor(-20)
-1 * 2^2 * 5
sage: f=factor(-20)
sage: list(f)
[(2, 2), (5, 1)]
sage: f.unit()
-1
sage: f.value()
-20
sage: factor( -next_prime(10^2) * next_prime(10^7) )
-1 * 101 * 10000019
sage: factor(-500, algorithm='kash')      # optional - kash
-1 * 2^2 * 5^3
sage: factor(-500, algorithm='magma')     # optional - magma
-1 * 2^2 * 5^3
sage: factor(0)
Traceback (most recent call last):
...
ArithmeticError: factorization of 0 is not defined
sage: factor(1)
1
sage: factor(-1)
-1
sage: factor(2^(2^7)+1)
59649589127497217 * 5704689200685129054721

Sage calls PARI’s factor, which has proof False by default. Sage has a global proof flag, set to True by default (see sage.structure.proof.proof, or proof.[tab]). To override the default, call this function with proof=False.

sage: factor(3^89-1, proof=False)
2 * 179 * 1611479891519807 * 5042939439565996049162197
sage: factor(2^197 + 1)  # long time (2s)
3 * 197002597249 * 1348959352853811313 * 251951573867253012259144010843

Any object which has a factor method can be factored like this:

sage: K.<i> = QuadraticField(-1)
sage: factor(122 - 454*i)
(-3*i - 2) * (-i - 2)^3 * (i + 1)^3 * (i + 4)

To access the data in a factorization:

sage: f = factor(420); f
2^2 * 3 * 5 * 7
sage: [x for x in f]
[(2, 2), (3, 1), (5, 1), (7, 1)]
sage: [p for p,e in f]
[2, 3, 5, 7]
sage: [e for p,e in f]
[2, 1, 1, 1]
sage: [p^e for p,e in f]
[4, 3, 5, 7]

We can factor Python, numpy and gmpy2 numbers:

sage: factor(math.pi)
3.141592653589793
sage: import numpy
sage: factor(numpy.int8(30))
2 * 3 * 5
sage: import gmpy2          # optional - gmpy2
sage: factor(gmpy2.mpz(30)) # optional - gmpy2
2 * 3 * 5

(4) Google search.  Make sure to use the words "sage math" in the search if you are having trouble.  If what you are searching is really about programming, not math, then use the keyword "python" (the underlying programming language for Sage).


Okay, now that you have some strategies for figuring out how to discover Sage's inbuilt functions, use them to figure out how to compute the list of divisors of mynumber.  (Hint: the answer should be [1,3].)

 
       
[1, 3]
[1, 3]

Section 3:  Modular Arithmetic

You can do modular arithmetic in Sage, which looks like this:

Mod(23, 7) 
       

That's fine for a one-off computation.  But what if we really want to play with integers modulo 7 a lot?  We can work in the integers modulo 7 by defining the ring (number system) itself as an object.  Here's how I would do that, and assign it to the name "R":

R = IntegerModRing(7) 
       

From now on, Sage knows the name "R" refers to the integers considered modulo 7.

Now I just wrap my integers in R (by this I mean write R(3) instead of just 3 etc.) and Sage will know that I want them to be thought of as integers modulo 7.  Then I can do all kinds of math, just using those!

R(23) 
       
R(3) + R(12) 
       

I can even take the inverse of something Mod 7.

R(2)^(-1) 
       

Let's double check that last result is correct.  Multiply the answer you got by 2 (working in R) and make sure you get 1.

 
       

Section 4a:  Basic Intro to Programming -- For Loop

Sage can do programming.  The simplest thing you might want to do as far as programming is to repeat something a bunch of times.  See if you can parse the following code and guess what it will do.  Then, run it and see.

for i in range(1,10): print R(i) 
       

Now, using the same idea, make Sage print the inverse of every possible value Modulo 7.  Use a "for loop" like the last example.

 
       

Next challenge:  use a for loop to do the following for every n in the range 10 to 20:

  • define the integers modulo n
  • compute the inverse of 101 in that ring
 
       

Section 4b:  Basic Intro to Programming -- If

Here's an example of the syntax for checking with an if statement, as well as printing a result.

for i in range(1,50): if Mod(i,7) == 0: print "I found something that is divisible by 7. I found: ", i 
       

Use a for loop with an if statement to print out all the primes in the range from 1 to 30.

 
       

Now here's your next challenge:  find all the twin primes up to 1000, and print them out.  A twin prime is a prime number p such that p+2 is also prime.

 
       

Section 4c:  Basic Intro to Programming -- Using a variable

Now remember we learned we can create variables.  Create a program that will compute i(i+1)(i+2) for each integer i from 1 to 5, and assign the result to a variable called product, and then print product.  It should print out five numbers (they should be 6, 24, 60, 120 and 210).

 
       

Now copy the program above and modify it as described here.  Create a program that will compute i(i+1)(i+2) for each integer i from 1 to 5, and then print the result modulo n for each n from 1 to 5.  To accomplish this, as before, assign the product i(i+1)(i+2) to a new variable product, so that you only need to write out the formula for it once and afterward can call (meaning, reference or use) that variable as you print out all its residues mod n.  Hint:  the result, if you've done it correctly, should print 25 numbers, all 0's except one 2.  (Weird!)

 
       

Section 5: Creating Functions (and rational numbers)

You can create your own functions.  Here's an example.

def myproduct( x, y ): return x*y 
       
myproduct(2, 3) 
       

Here's a more complicated example, which computes the mediant of two fractions.  The mediant of a/b and c/d is (a+c)/(b+d).  (This is not the correct way to add fractions!  But it has its uses.)

def mediant( x, y ): a = x.numerator() b = x.denominator() c = y.numerator() d = y.denominator() return (a+c)/(b+d) 
       
mediant( 1/3, 4/5 ) 
       

Your challenge:  create a function adds_to_zero( x, y ) which returns True if x and y add to 0 and False otherwise.

 
       

Section 6:  Lists

One of the most useful things in Python is lists.  A list is a list of objects, e.g. a list of numbers.  You can manipulate lists to do all manner of things.  Here's how you define a list:

mylist = [ 1, 2, 3, 4, 5 ] 
       

Here's how you get the an element from a list.  The first element is "element 0", the second element is "element 1" and so on (weird, I know).  After running this code, modify it to pull out the element "3" instead.

mylist[0] 
       
1
1

You can even pull out a range of your list (remember, the notation is inclusive on the left and not on the right).  After running the code below, modify it to pull out the sublist [3,4,5].

mylist[1:3] 
       
[2, 3]
[2, 3]

You can define lists very compactly (avoiding making for loops).  For example:

gcdwithme = [ gcd( x, 15 ) for x in range(1,20) ] 
       
gcdwithme 
       

Side tip:  The syntax range(a,b) which we've been using actually just returns a list:  the list of all the integers starting with a and ending with b-1.  Try it:

range(1,10) 
       
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Your turn.  Make a list of the values of x modulo $11$, for x in the range of 1 to 100.  Assign it to the variable name mymodlist.

 
       

Now print out the 11th term of your list.

 
       

Section 6b:  Graphing

You can graph data easily in sage by sending a list to the function list_plot.

list_plot( [ x^2 for x in range(1,300) ] ) 
       

Now draw a graph of mymodlist, the list you made in the last section.  Don't create it again; just reference the variable.

 
       

Section 7:  Lists and Strings

String are treated a lot like lists in in Python.  Here's a string and how you can pull out any one character or substring.

mystring = "DrDoolittle" print mystring[1] print mystring[2:5] 
       

Your turn!  Create a string that has a hidden word in it, and then print out that substring using the method demonstrated above.

 
       

A pretty useful trick is to look up the first occurrence of a character or entry in a string or list.

mystring.index("o") 
       

You can also make an empty list with [].  You can add things with append().

mylist = [] mylist.append(4) print(mylist) 
       

You can make an empty string with "" or ''.  You can add things to it with +=.  

mystring = "" mystring += "ooo" print(mystring) 
       

The "+=" thing also works for numbers.

mynum = 3 mynum += 4 mynum 
       

Make an empty string and use += in a loop to create the string "0123456789".  Hint: you can use str() to change a number into a string.

 
       

Now use the "index" function demonstrated above to look up the first occurence of the character "4" in your string.

 
       

One more useful thing:  the length of a string or list.

mystring = "54321" print( len(mystring) ) 
       


Section 8a:  Matrices

Here's how you can make matrices.  They don't look great when printed, so there's a command "show".

M = matrix([[1,2],[3,4]]) N = matrix([[1,0],[0,1]]) show(N) show(M) 
       

You can add matrices.

N+M 
       

Figure out how to multiply matrices.

 
       

You should think of a vector just as a matrix with only one column (or one row).  For example, to evaluate a matrix on a vector, you might do the following:

mymatrix = matrix([[2,0],[0,2]]) myvector = matrix([[1],[2]]) show(mymatrix) show(myvector) show(mymatrix*myvector) 
       

You can also define matrices over a ring of integers modulo n.  Notice how the syntax for the matrix provides Sage the information that you want it to consider the entries in the appropriate ring.  Two of these are matrices with entries mod 26.  One of them has just integer entries.  You can't tell the difference with show().  (Hint: if you really want to know the difference sometime, try parent().)

R = IntegerModRing(26) M = matrix(R, [[7,14],[20,19]]) N = matrix(R, [[1,2],[3,4]]) P = matrix([[1,2],[3,4]]) show(M) show(N) show(P) 
       

Now figure out how to compute the inverse of M.

 
       

What's the inverse of N?  (Hint: this won't work!  Why?)

 
       

Now the inverse of P?  Notice this is asking a different question than the inverse of N, because we are working in a different ring.

 
       

Now compute and print the determinants of N and P.  Why are the answers what you expect?

 
       

Section 8b:  Matrices and Matrix Plots

Another really useful graphical method is matrix_plot, which will plot data from an array.  First, I'll show you how to make a matrix (this one is 7x7 with entries in the integers, which Sage refers to as ZZ).  This matrix is just full of zeroes by default.

mymatrix = matrix(ZZ,7,7) 
       

Here's how you get one entry of your matrix:

mymatrix[0,2] 
       

Now I will populate my matrix with an addition table modulo 7.

for i in range(0,7): for j in range(0,7): mymatrix[i,j] = R(i)+R(j) 
       

First, let's print the matrix:

mymatrix 
       

And now let's "plot" it.

matrix_plot( mymatrix, colorbar = True, cmap = 'GnBu' ) 
       

Section X:  Challenges

Here's a series of challenges for you to practice the skills you learned above and learn new ones.  Read through them first and pick the ones that interest you.

Challenge 1:

Create a list of all the primes up to 100 which are 1 modulo 12.  Hint: you could use is_prime() or you could use primes().

 
       

Challenge 2:

Find all the 3-character substrings of the string "toboldlygowherenoonehasgonebefore" that begin with the character "o".

 
       

Challenge 2:

Lists can be useful for storing the results of a computation.  Use a loop to create and store all the caesar shifts of the word "fuzzy". 

 
       

Challenge 3:

Create a function that, given $n$, will return those residues modulo $n$ which are squares.

 
       

Challenge 4:

Find all the primes up to 100 which are sums of two integer squares (e.g. $5 = 1^2 + 2^2$ but 7 has no such expression).

 
       

Challenge 6:

Use matrix_plot to show, in a square grid 100x100, which pairs of integers are coprime (coprime means gcd=1).

 
       

Challenge 7:

Make Sage output a LaTeX table suitable for cutting and pasting into your homework (there are inbuilt functions for this!), showing the multiplication table for integers modulo 13.

 
       

Challenge 8:

Given the beginning of "Tale of Two Cities" by Charles Dickens, given below, obtain the version of this string containing just its alphabet characters.  Then, perform a frequency analysis.  That is, output a list whose n-th entry is the frequency of the n-th letter of the roman alphabet in the string.  Finally, graph the result.


It was the best of times,
it was the worst of times,
it was the age of wisdom,
it was the age of foolishness,
it was the epoch of belief,
it was the epoch of incredulity,
it was the season of Light,
it was the season of Darkness,
it was the spring of hope,
it was the winter of despair,

 
       

Bonus task on reading code:

Here's some code that creates all Caesar shifts of the word "fuzzy".  Modify this copy of the code to change a few things:  (1) only shift by even shifts and (2) make the alphabet exclude the letters "mnop", so the shift works on the smaller alphabet (skipping over those letters), and (3) fix the code so it won't break when we change the word "fuzzy" to "fuzz".

myword = "fuzzy" myalphabet = "abcdefghijklmnopqrstuvwxyz" allshifts = [] for shift in range(0,26): caesarshift = "" for position in range(5): letter = myword[position] newletter = myalphabet[Mod(myalphabet.index(letter)+shift,26)] caesarshift += newletter allshifts.append(caesarshift) print(allshifts) 
       
['fuzzy', 'gvaaz', 'hwbba', 'ixccb', 'jyddc', 'kzeed', 'laffe', 'mbggf',
'nchhg', 'odiih', 'pejji', 'qfkkj', 'rgllk', 'shmml', 'tinnm', 'ujoon',
'vkppo', 'wlqqp', 'xmrrq', 'ynssr', 'zotts', 'apuut', 'bqvvu', 'crwwv',
'dsxxw', 'etyyx']
['fuzzy', 'gvaaz', 'hwbba', 'ixccb', 'jyddc', 'kzeed', 'laffe', 'mbggf', 'nchhg', 'odiih', 'pejji', 'qfkkj', 'rgllk', 'shmml', 'tinnm', 'ujoon', 'vkppo', 'wlqqp', 'xmrrq', 'ynssr', 'zotts', 'apuut', 'bqvvu', 'crwwv', 'dsxxw', 'etyyx']