A.I, Data and Software Engineering

In part 1, we introduced advanced string and bytes manipulation in Python. This article covers more advanced Python features: useful built-in functions and other tools for sequence iteration and data transformation.

### Useful built-in functions

• Use `any()` to return `True` if at least one value in the sequence is truthy
• Use `all()` to return `True` only if every value in the sequence is truthy
• Quickly find the minimum/maximum value in a sequence by using `min()` and `max()`
• Use `sum()` to add up all of the values in a sequence
``````def main():
    # use any() and all() to test sequences for boolean values
    list1 = [1, 2, 3, 0, 5, 6]
    # any will return True if any of the sequence values are true
    print(any(list1))  # True
    # all will return True only if all values are true
    print(all(list1))  # False, because 0 is falsy
    # min and max will return minimum and maximum values in a sequence
    print("min: ", min(list1))
    print("max: ", max(list1))
    # Use sum() to sum up all of the values in a sequence
    print("sum: ", sum(list1))


if __name__ == "__main__":
    main()``````
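
Both `any()` and `all()` accept any iterable, so they pair naturally with generator expressions, and `min()`/`max()` take an optional `key` function. The following is a minimal sketch of those uses; `readings` and `words` are made-up sample data, not part of the example above.

```python
# Hypothetical sample data for illustration
readings = [12, 7, 0, 19, 3]
words = ["kiwi", "banana", "fig"]

# Generator expressions let any()/all() test a condition per element
has_large = any(r > 15 for r in readings)    # True: 19 > 15
all_positive = all(r > 0 for r in readings)  # False: 0 is not > 0

# key= changes what min()/max() compare, here the string length
shortest = min(words, key=len)
longest = max(words, key=len)

# Edge cases on an empty iterable
empty_any = any([])  # False: no truthy element exists
empty_all = all([])  # True: no falsy element exists

print(has_large, all_positive, shortest, longest, empty_any, empty_all)
```

The empty-iterable results are worth remembering when a list may legitimately be empty: `all([])` is `True`, which can hide a missing-data case.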

### Iterators

There are several methods to loop through elements in a sequence.

• Use `iter()` to create an iterator over a collection
• Use `enumerate()` to get a counter (index) alongside each element, which shortens loop code
``````#Create a testfile.txt file that contains the following text.
This is line 1
This is line 2
This is line 3
This is line 4
This is line 5
This is line 6``````

We can now run the example using the file we just created.

``````# define a list of days in English and French
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
daysFr = ["Dim", "Lun", "Mar", "Mer", "Jeu", "Ven", "Sam"]
# use iter to create an iterator over a collection
i = iter(days)
print(next(i))  # Sun
print(next(i))  # Mon
print(next(i))  # Tue
# iterate using a function and a sentinel:
# iter(fp.readline, "") calls readline() until it returns "" (end of file)
with open("testfile.txt", "r") as fp:
    for line in iter(fp.readline, ""):
        print(line, end="")  # each line already ends with a newline
#This is line 1
#This is line 2
#This is line 3
#This is line 4
#This is line 5
#This is line 6``````
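
The two-argument form `iter(callable, sentinel)` is easy to misread, so here is a small self-contained sketch of it. It assumes an in-memory `io.StringIO` object as a stand-in for `testfile.txt`, so that it runs without any file on disk, and it also shows the optional default argument of `next()`.

```python
import io

# In-memory stand-in for testfile.txt (an assumption so the sketch runs anywhere)
fp = io.StringIO("This is line 1\nThis is line 2\nThis is line 3\n")

# iter(callable, sentinel) calls fp.readline() repeatedly and stops as soon
# as it returns the sentinel "" (which is what readline gives at end of file)
lines = [line.rstrip("\n") for line in iter(fp.readline, "")]
print(lines)

# An exhausted iterator raises StopIteration; next() accepts a default instead
it = iter(["Sun", "Mon"])
first = next(it)
second = next(it)
third = next(it, None)  # nothing left, so the default is returned
print(first, second, third)
```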

Use `enumerate()` together with `zip()` to loop through combined sequences while keeping a counter.

``````# use regular iteration over the days
for m in range(len(days)):
    print(m+1, days[m])
# using enumerate reduces code and provides a counter
for i, m in enumerate(days, start=1):
    print(i, m)
# use zip to combine sequences
for m in zip(days, daysFr):
    print(m)
for i, m in enumerate(zip(days, daysFr), start=1):
    print(i, m[0], "=", m[1], "in French")
#1 Sun = Dim in French
#2 Mon = Lun in French
#3 Tue = Mar in French
#4 Wed = Mer in French
#5 Thu = Jeu in French
#6 Fri = Ven in French
#7 Sat = Sam in French``````
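
A few more things follow from the fact that `zip()` yields tuples. The sketch below repeats the two lists so that it runs on its own; `en_to_fr`, `short_pairs`, and `labels` are names chosen here for illustration.

```python
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
daysFr = ["Dim", "Lun", "Mar", "Mer", "Jeu", "Ven", "Sam"]

# zip yields (english, french) tuples, which dict() can consume directly
en_to_fr = dict(zip(days, daysFr))
print(en_to_fr["Wed"])

# zip stops at the shortest input: only three pairs come out here
short_pairs = list(zip(days, daysFr[:3]))
print(short_pairs)

# Unpack the pair in the loop header instead of indexing m[0] and m[1]
labels = [f"{i} {en} = {fr}" for i, (en, fr) in enumerate(zip(days, daysFr), start=1)]
print(labels[0])
```

Because `zip()` silently truncates to the shortest input, it is worth checking lengths first when the sequences are expected to match.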

### Itertools

Next, we look at a couple of infinite iterators, which keep generating values for as long as you ask for them and never end on their own. The first is `cycle`, which does what its name implies: it cycles over a set of values, starting again from the beginning once it reaches the end.

``````import itertools
seq1 = ["Joe", "John", "Mike"]
cycle1 = itertools.cycle(seq1)
print(next(cycle1))
print(next(cycle1))
print(next(cycle1))
print(next(cycle1))
#Joe John Mike Joe``````
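
Since `cycle` never ends, it is usually combined with something finite. One possible use, sketched below, is round-robin assignment; the `tasks` list is a hypothetical set of work items invented for this example.

```python
import itertools

seq1 = ["Joe", "John", "Mike"]
tasks = ["task-a", "task-b", "task-c", "task-d", "task-e"]  # hypothetical work items

# zip ends when tasks runs out, so the infinite cycle is safely bounded
assignments = list(zip(tasks, itertools.cycle(seq1)))
print(assignments)

# islice is another way to take a fixed number of items from an infinite iterator
first_seven = list(itertools.islice(itertools.cycle(seq1), 7))
print(first_seven)
```

Calling `list()` directly on `itertools.cycle(seq1)` would never finish, which is why both uses above put a bound on it.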

The next infinite iterator is `count`, which creates a counter. The start value defaults to 0 and the step defaults to 1; in this example we start at 100 and use a step of 10.

``````# use count to create a simple counter
count1 = itertools.count(100, 10)
print(next(count1)) #100
print(next(count1)) #110
print(next(count1)) #120``````
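
As with `cycle`, a `count` iterator needs something else to stop it. The sketch below shows two common ways, `itertools.takewhile()` and `zip()`; the limit of 150 and the sample labels are arbitrary values chosen for illustration.

```python
import itertools

# Take values from the endless counter only while they stay below a limit
below_150 = list(itertools.takewhile(lambda n: n < 150, itertools.count(100, 10)))
print(below_150)  # [100, 110, 120, 130, 140]

# Pair a counter with a finite sequence; zip stops when the labels run out
ids = list(zip(itertools.count(1000, 5), ["a", "b", "c"]))
print(ids)  # [(1000, 'a'), (1005, 'b'), (1010, 'c')]
```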

`itertools.accumulate()` produces running results over a sequence. By default it accumulates by addition, giving a running total:

``````# accumulate creates an iterator that accumulates values (a running sum by default)
vals = [10,20,30,40,50,40,30]
acc = itertools.accumulate(vals)
print(list(acc))
#[10, 30, 60, 100, 150, 190, 220]``````

The default addition can be replaced by passing another two-argument function, such as the built-in `max`. The result is then a running maximum: each output is the largest value seen so far, so every position after the maximum (50) also reports 50.

``````vals = [10,20,30,40,50,40,30]
acc = itertools.accumulate(vals, max)
print(list(acc))
#[10, 20, 30, 40, 50, 50, 50]``````
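
Any two-argument function works as the second argument. As a sketch, the block below uses `operator.mul` for a running product and the `initial` keyword (available from Python 3.8) to seed the accumulation; `factors` and the balance figures are made-up values.

```python
import itertools
import operator

# Running product: each output is the product of all values so far
factors = [1, 2, 3, 4, 5]  # hypothetical input
running_product = list(itertools.accumulate(factors, operator.mul))
print(running_product)  # [1, 2, 6, 24, 120]

# initial= (Python 3.8+) seeds the accumulation and adds one leading item
balance = list(itertools.accumulate([50, -20, 30], initial=100))
print(balance)  # [100, 150, 130, 160]
```

Note that with `initial`, the output has one more element than the input.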

The `itertools` module also has tools that are handy when working with files, for example to read a file in chunks of lines. The `islice()` function does this: it returns an iterator over the next `n` lines of the file. Near the end of the file the chunk may be shorter than `n`, and once the file is exhausted the iterator yields nothing, so converting it with `list()` gives an empty list.

``````from itertools import islice
# define the name of the file to read from
filename = "test.txt"
# define the number of lines to read
number_of_lines = 5
with open(filename, 'r') as input_file:
    # islice lazily takes at most number_of_lines lines from the file
    lines_cache = islice(input_file, number_of_lines)
    for current_line in lines_cache:
        print(current_line, end="")  # each line already ends with a newline``````
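
The example above reads only the first chunk. To walk through a whole file chunk by chunk, `islice()` can be called repeatedly until it comes back empty. The following is a minimal sketch of that idea: `read_in_chunks` is a hypothetical helper name, and an in-memory `io.StringIO` object stands in for a real file so the code runs anywhere.

```python
import io
from itertools import islice


def read_in_chunks(file_obj, number_of_lines):
    """Yield lists of up to number_of_lines lines until the file is exhausted."""
    while True:
        chunk = list(islice(file_obj, number_of_lines))
        if not chunk:  # empty list: nothing left to read
            return
        yield chunk


# In-memory stand-in for a 6-line file (an assumption for this sketch)
sample = io.StringIO("".join(f"This is line {n}\n" for n in range(1, 7)))

chunk_sizes = [len(chunk) for chunk in read_in_chunks(sample, 4)]
print(chunk_sizes)  # [4, 2]
```

This works because a file object is its own iterator, so each `islice()` call continues from where the previous one stopped. The last chunk is shorter when the line count is not a multiple of the chunk size.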
