A.I, Data and Software Engineering

Advanced Python, part 2


In part 1, we introduced advanced string and bytes manipulation in Python. This article covers built-in functions and other useful tools for sequence iteration and data transformation.

Useful built-in functions

  • Use any() to return True if at least one value in the sequence is truthy
  • Use all() to return True only if every value in the sequence is truthy
  • Quickly find the minimum/maximum value in a sequence by using min() and max()
  • Use sum() to add up all of the values in a sequence
def main():
    # use any() and all() to test sequences for boolean values
    list1 = [1, 2, 3, 0, 5, 6]
    # any() returns True if at least one value in the sequence is truthy
    print(any(list1))  # True
    # all() returns True only if every value is truthy; the 0 makes it False
    print(all(list1))  # False
    # min() and max() return the smallest and largest values in a sequence
    print("min: ", min(list1))  # 0
    print("max: ", max(list1))  # 6
    # use sum() to add up all of the values in a sequence
    print("sum: ", sum(list1))  # 17
if __name__ == "__main__":
    main()
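
Because any() and all() accept any iterable, they combine well with generator expressions, and min() and max() take an optional key function. The following is a minimal sketch of that idea; the `words` list is hypothetical sample data and not part of the original example.

```python
# Hypothetical sample data, used only for illustration
words = ["kiwi", "banana", "fig", "cherry"]

# any()/all() with generator expressions test a condition on each item
has_short_word = any(len(w) < 4 for w in words)  # "fig" has 3 letters
all_lowercase = all(w.islower() for w in words)

# min()/max() accept a key function to compare items by a derived value;
# on a tie ("banana" and "cherry" both have 6 letters) the first one wins
shortest = min(words, key=len)
longest = max(words, key=len)

print(has_short_word, all_lowercase)  # True True
print(shortest, longest)              # fig banana
```

The generator expressions stop as soon as the answer is known, so any() does not inspect items after the first truthy one.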

Iterators

There are several methods to loop through elements in a sequence.

  • Use iter() to create an iterator over a collection
  • Use enumerate() to reduce code and get a counter (index) alongside each item
#Create a testfile.txt file that contains the following text.
This is line 1
This is line 2
This is line 3
This is line 4
This is line 5
This is line 6

We can now run the example using the file we created.

# define a list of days in English and French
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
daysFr = ["Dim", "Lun", "Mar", "Mer", "Jeu", "Ven", "Sam"]
# use iter to create an iterator over a collection
i = iter(days)
print(next(i))  # Sun
print(next(i))  # Mon
print(next(i))  # Tue
# iterate using a function and a sentinel
with open("testfile.txt", "r") as fp:
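    # fp.readline is called repeatedly until it returns '' (end of file)
    for line in iter(fp.readline, ''):
        # each line already ends with a newline, so suppress print's own
        print(line, end='')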
#This is line 1
#This is line 2
#This is line 3
#This is line 4
#This is line 5
#This is line 6
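
The two-argument form iter(callable, sentinel) is not limited to readline. As a rough sketch, the snippet below uses io.StringIO as an in-memory stand-in for testfile.txt (an assumption made so the example runs without a file on disk) and applies the same pattern to fixed-size reads.

```python
import io
from functools import partial

# In-memory stand-in for a file, so the sketch runs without testfile.txt
text = "This is line 1\nThis is line 2\nThis is line 3\n"
fp = io.StringIO(text)

# iter(callable, sentinel) calls fp.readline() until it returns ''
lines = [line.rstrip("\n") for line in iter(fp.readline, "")]
print(lines)  # ['This is line 1', 'This is line 2', 'This is line 3']

# The same pattern reads fixed-size chunks: fp.read(8) returns '' at the end
fp.seek(0)
chunks = list(iter(partial(fp.read, 8), ""))
print(chunks)
```

functools.partial is used here only to turn fp.read(8) into a zero-argument callable, which is what iter() expects in its sentinel form.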

Use enumerate() with zip() to loop through combined sequences.

# use regular iteration over the days
for m in range(len(days)):
    print(m+1, days[m])
# using enumerate reduces code and provides a counter
for i, m in enumerate(days, start=1):
    print(i, m)
# use zip to combine sequences
for m in zip(days, daysFr):
    print(m)
for i, m in enumerate(zip(days, daysFr), start=1):
    print(i, m[0], "=", m[1], "in French")
#1 Sun = Dim in French
#2 Mon = Lun in French
#3 Tue = Mar in French
#4 Wed = Mer in French
#5 Thu = Jeu in French
#6 Fri = Ven in French
#7 Sat = Sam in French
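
One detail worth knowing when combining sequences: zip() stops at the shortest input. The sketch below illustrates this with deliberately shortened lists (not the full lists from the example above) and shows itertools.zip_longest as one way to pad the shorter input.

```python
from itertools import zip_longest

days = ["Sun", "Mon", "Tue"]
days_fr = ["Dim", "Lun"]  # deliberately one item short

# zip() stops as soon as the shortest input is exhausted
pairs = list(zip(days, days_fr))
print(pairs)  # [('Sun', 'Dim'), ('Mon', 'Lun')]

# zip_longest() keeps going and pads the shorter input with fillvalue
padded = list(zip_longest(days, days_fr, fillvalue="?"))
print(padded)  # [('Sun', 'Dim'), ('Mon', 'Lun'), ('Tue', '?')]

# dict(zip(...)) builds a lookup table from two parallel sequences
en_to_fr = dict(zip(days, days_fr))
print(en_to_fr["Mon"])  # Lun
```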

Itertools

This section demonstrates a couple of infinite iterators: iterators that keep generating values for as long as you request them and never end on their own. The first, cycle(), does what its name implies: it cycles over a set of values, starting again from the beginning once it reaches the end.

import itertools
seq1 = ["Joe", "John", "Mike"]
cycle1 = itertools.cycle(seq1)
print(next(cycle1))
print(next(cycle1))
print(next(cycle1))
print(next(cycle1))
#Joe John Mike Joe
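
A common use of cycle() is round-robin assignment. The following is a small sketch under the assumption that we want to hand out a list of tasks to the three names above in turn; the `tasks` list is hypothetical.

```python
import itertools

people = ["Joe", "John", "Mike"]
tasks = ["wash", "cook", "shop", "clean", "iron"]  # hypothetical tasks

# zip() stops when tasks runs out, so pairing with an infinite cycle is safe
assignments = list(zip(tasks, itertools.cycle(people)))
for task, person in assignments:
    print(task, "->", person)
# wash -> Joe
# cook -> John
# shop -> Mike
# clean -> Joe
# iron -> John
```

Calling list() directly on itertools.cycle(people) would never finish, which is why the cycle is always paired with something finite here.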

The next infinite iterator is count(), which creates a counter. Both arguments are optional: the start value defaults to zero and the step defaults to one. Here the counter starts at 100 with a step of 10.

# use count to create a simple counter
count1 = itertools.count(100, 10)
print(next(count1)) #100
print(next(count1)) #110
print(next(count1)) #120
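
Since count() never stops, it is usually bounded by something else. As an illustration, this sketch takes a finite slice with islice() and uses count() as an index source for zip(); the letter list is made up for the example.

```python
import itertools

# islice() takes only the first five values from the infinite counter
first_five = list(itertools.islice(itertools.count(100, 10), 5))
print(first_five)  # [100, 110, 120, 130, 140]

# count() can also supply running numbers; zip() ends with the finite list
labelled = list(zip(itertools.count(1), ["a", "b", "c"]))
print(labelled)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```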

itertools can also accumulate values. By default, accumulate() yields running sums:

# accumulate() creates an iterator that yields accumulated values (running sums by default)
vals = [10,20,30,40,50,40,30]
acc = itertools.accumulate(vals)
print(list(acc))
#[10, 30, 60, 100, 150, 190, 220]

The default addition can be replaced with another two-argument function, such as the built-in max(). The result is then a running maximum: each output value is the largest input seen so far, so once the maximum (50) is reached, every later value stays at 50.

vals = [10,20,30,40,50,40,30]
acc = itertools.accumulate(vals, max)
print(list(acc))
#[10, 20, 30, 40, 50, 50, 50]
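
Any function of two arguments can be passed to accumulate(). As a further sketch, operator.mul gives a running product and the built-in min() gives a running minimum; the input lists are chosen only for illustration.

```python
import itertools
import operator

# operator.mul turns accumulate() into a running product
factors = [1, 2, 3, 4, 5]
running_product = list(itertools.accumulate(factors, operator.mul))
print(running_product)  # [1, 2, 6, 24, 120]

# min() yields the smallest value seen so far at each position
readings = [5, 3, 8, 1, 4]
running_min = list(itertools.accumulate(readings, min))
print(running_min)  # [5, 3, 3, 1, 1]
```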

itertools also offers tools that are useful when handling files, for example reading a file in chunks of lines. The islice() function from the itertools module does this: given a file object, it returns an iterator over the next n lines. Near the end of the file the chunk may be shorter than n, and once the file is exhausted the iterator yields nothing.

from itertools import islice
# define the name of the file to read from
filename = "test.txt"
# define the number of lines to read
number_of_lines = 5
with open(filename, 'r') as input_file:
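    # islice() yields at most number_of_lines lines from the file
    lines_cache = islice(input_file, number_of_lines)
    for current_line in lines_cache:
        # each line already ends with a newline, so suppress print's own
        print(current_line, end='')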
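
To process a whole file chunk by chunk, islice() can be called repeatedly on the same file object until it produces nothing. The following is a minimal sketch; `read_chunks` is a hypothetical helper name, and io.StringIO stands in for a real file so the example is self-contained.

```python
import io
from itertools import islice

def read_chunks(lines, size):
    """Yield lists of up to `size` lines until `lines` is exhausted."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:  # an empty list means nothing is left to read
            return
        yield chunk

# In-memory stand-in for a file with six lines
fake_file = io.StringIO("".join(f"This is line {n}\n" for n in range(1, 7)))
chunks = list(read_chunks(fake_file, 4))
print([len(c) for c in chunks])  # [4, 2]
```

The last chunk is shorter than the requested size because only two lines remain, which matches the behaviour described above.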

You may be interested in the advanced python method to find object size.
