Reactive Programming

Reactive programming is a programming paradigm that deals with data flows and the propagation of change. When one component emits a data item, the reactive programming library propagates that change to the other components that depend on it, and the propagation continues until it reaches the final receiver. The difference between event-driven and reactive programming is that event-driven programming revolves around events, whereas reactive programming revolves around data.

ReactiveX or RX for reactive programming

ReactiveX, or Reactive Extensions, is the most famous implementation of reactive programming. The working of ReactiveX depends upon the following two classes −

Observable class

This class is the source of the data stream or events. It packs the incoming data so that the data can be passed from one thread to another. It does not emit data until some observer subscribes to it.

Observer class

This class consumes the data stream emitted by an observable. An observable can have multiple observers, and each observer receives each data item that is emitted. By subscribing to an observable, an observer can receive three types of events −

on_next() event − It implies there is an element in the data stream.
on_completed() event − It implies the end of emission; no more items are coming.
on_error() event − It also implies the end of emission, but because the observable raised an error.

RxPY – Python Module for Reactive Programming

RxPY is a Python module which can be used for reactive programming. We need to ensure that the module is installed. The following command can be used to install the RxPY module −

    pip install RxPY

Example

Following is a Python script which uses the RxPY module and its classes Observable and Observer for reactive programming. It is built from two pieces −

get_strings() − a function that pushes the strings to the observer.
PrintObserver() − a class that prints the strings it receives.
It uses all three events of the Observer class. It also uses the subscribe() method. The imports follow the RxPY 1.x API; later major releases reorganized the module, so the script may need adapting there.

    from rx import Observable, Observer

    # Push three items to the subscriber, then signal completion.
    def get_strings(observer):
        observer.on_next("Ram")
        observer.on_next("Mohan")
        observer.on_next("Shyam")
        observer.on_completed()

    # Handle each of the three event types an observer can receive.
    class PrintObserver(Observer):
        def on_next(self, value):
            print("Received {0}".format(value))

        def on_completed(self):
            print("Finished")

        def on_error(self, error):
            print("Error: {0}".format(error))

    source = Observable.create(get_strings)
    source.subscribe(PrintObserver())

Output

    Received Ram
    Received Mohan
    Received Shyam
    Finished

PyFunctional library for reactive programming

PyFunctional is another Python library that can be used for reactive programming. It enables us to create functional programs in Python, and it is useful because it allows us to create data pipelines by chaining functional operators.

Difference between RxPY and PyFunctional

Both libraries are used for reactive programming and handle a stream in a similar fashion; the main difference lies in what they handle. RxPY handles data and events in the system, while PyFunctional focuses on transforming data using functional programming paradigms.

Installing PyFunctional Module

We need to install this module before using it. It can be installed with pip as follows −

    pip install pyfunctional

Example

The following example uses the PyFunctional module and its seq class, which acts as the stream object that we can iterate over and manipulate. The program maps the sequence with a lambda function that doubles every value, then keeps the values greater than 4, and finally reduces the sequence to the sum of the remaining values.

    from functional import seq

    # double -> keep values above 4 -> sum what is left
    result = seq(1, 2, 3).map(lambda x: x * 2).filter(lambda x: x > 4).reduce(lambda x, y: x + y)
    print("Result: {}".format(result))

Output

    Result: 6
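For comparison, the same map, filter and reduce pipeline can be sketched with only the standard library. This is an illustration of the chaining idea, not part of PyFunctional; the names values, doubled, kept and total are introduced here for the example.

```python
from functools import reduce

values = [1, 2, 3]

# Step 1: double every value (map returns a lazy iterator).
doubled = map(lambda x: x * 2, values)
# Step 2: keep only the values greater than 4.
kept = filter(lambda x: x > 4, doubled)
# Step 3: fold the remaining values into their sum.
total = reduce(lambda x, y: x + y, kept)

print("Result: {}".format(total))  # prints "Result: 6"
```

Note that reduce() raises TypeError when no value survives the filter, unless an initial value is passed as its third argument.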
Benchmarking & Profiling
In this chapter, we will learn how benchmarking and profiling help in addressing performance issues.

Suppose we have written code that gives the desired result, but the requirements have changed and we now want it to run faster. In this case, we need to find out which parts of our code are slowing down the entire program, and this is where benchmarking and profiling are useful.

What is Benchmarking?

Benchmarking aims at evaluating something by comparison with a standard. In software programming, benchmarking the code means measuring how fast the code executes and where the bottleneck is. One major reason for benchmarking is that it guides optimization: it shows which parts of the code are worth improving.

How does benchmarking work?

We start by benchmarking the whole program in its current state. Then we decompose the program into smaller parts and run micro benchmarks on them in order to find the bottlenecks and optimize them. In other words, we break one big, hard problem into a series of smaller and easier problems and optimize each of them.

Python module for benchmarking

Python's standard library includes a module for benchmarking called timeit. With the help of the timeit module, we can measure the performance of small bits of Python code within our main program.
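As a minimal sketch of a micro benchmark, timeit.timeit() runs a callable a fixed number of times and returns the total elapsed seconds, while timeit.repeat() performs several such runs. The function build_squares() is a made-up workload used only for illustration.

```python
import timeit

def build_squares():
    # Hypothetical workload used only for illustration.
    return [n * n for n in range(1000)]

# Run the callable 1,000 times and return the total elapsed seconds.
total = timeit.timeit(build_squares, number=1000)

# Repeat the whole measurement 3 times and keep the fastest run,
# which is usually the one least disturbed by other activity.
best = min(timeit.repeat(build_squares, number=1000, repeat=3))

print("total: {:.6f} s, best of 3: {:.6f} s".format(total, best))
```

The absolute numbers depend on the machine, so they are mainly useful for comparing two versions of the same code on the same system.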
Example

In the following Python script, we import the timeit module and use it to measure the time taken to execute two functions, functionA and functionB −

    import timeit

    def functionA():
        print("Function A starts the execution:")
        print("Function A completes the execution:")

    def functionB():
        print("Function B starts the execution")
        print("Function B completes the execution")

    # default_timer() returns the current reading of a high-resolution clock.
    start_time = timeit.default_timer()
    functionA()
    print(timeit.default_timer() - start_time)

    start_time = timeit.default_timer()
    functionB()
    print(timeit.default_timer() - start_time)

After running the above script, we will get the execution time of both the functions as shown below.

Output

    Function A starts the execution:
    Function A completes the execution:
    0.0014599495514175942
    Function B starts the execution
    Function B completes the execution
    0.0017024724827479076

Writing our own timer using the decorator function

In Python, we can create our own timer, which measures execution time in a way similar to the timeit module. It can be done with the help of a decorator function. Following is an example of the custom timer −

    import random
    import time

    def timer_func(func):
        def function_timer(*args, **kwargs):
            start = time.time()              # time before the call
            value = func(*args, **kwargs)
            end = time.time()                # time after the call returns
            runtime = end - start
            msg = "{func} took {time} seconds to complete its execution."
            print(msg.format(func=func.__name__, time=runtime))
            return value
        return function_timer

    @timer_func
    def Myfunction():
        for x in range(5):
            sleep_time = random.choice(range(1, 3))
            time.sleep(sleep_time)

    if __name__ == "__main__":
        Myfunction()

The above Python script imports the random and time modules. We have created the timer_func() decorator function, which has the function_timer() function inside it. The nested function records the time before calling the passed-in function, waits for the function to return, and then records the end time. In this way, the script can print the execution time.
The script will generate the output as shown below.

Output

    Myfunction took 8.000457763671875 seconds to complete its execution.

What is profiling?

Sometimes the programmer wants to measure attributes of a program, such as its memory use, its time complexity or its usage of particular instructions, in order to assess the real capability of that program. This kind of measurement is called profiling. Profiling uses dynamic program analysis, that is, it measures the program while it runs.

In the subsequent sections, we will learn about the different Python modules for profiling.

cProfile – the inbuilt module

cProfile is a Python built-in module for profiling. The module is a C extension with reasonable overhead, which makes it suitable for profiling long-running programs. After a run, it reports all the functions that were called and their execution times. It is very powerful, but its output is sometimes a bit difficult to interpret and act on. In the following example, we use cProfile on the code below −

Example

    import threading

    def increment_global():
        global x
        x += 1

    def taskofThread(lock):
        for _ in range(50000):
            # Hold the lock while updating the shared counter.
            with lock:
                increment_global()

    def main():
        global x
        x = 0
        lock = threading.Lock()
        t1 = threading.Thread(target=taskofThread, args=(lock,))
        t2 = threading.Thread(target=taskofThread, args=(lock,))
        t1.start()
        t2.start()
        t1.join()
        t2.join()

    if __name__ == "__main__":
        for i in range(5):
            main()
            print("x = {1} after Iteration {0}".format(i, x))

The above code is saved in the thread_increment.py file.
Now, execute the code with cProfile on the command line as follows −

    (base) D:\ProgramData>python -m cProfile thread_increment.py
    x = 100000 after Iteration 0
    x = 100000 after Iteration 1
    x = 100000 after Iteration 2
    x = 100000 after Iteration 3
    x = 100000 after Iteration 4
          3577 function calls (3522 primitive calls) in 1.688 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            5    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:103(release)
            5    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:143(__init__)
            5    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:147(__enter__)
       …
       …
       …
       …

From the above output, it is clear that cProfile recorded 3577 function calls and reports, for every function, the time spent in it and the number of times it was called. Following are the columns we got in the output −

ncalls − It is the number of calls made.
tottime − It is the total time spent in the given function, excluding the time spent in the functions it calls.
percall − It refers to the quotient of tottime divided by ncalls.
cumtime − It is the cumulative time spent in this function and all subfunctions. It is accurate even for recursive functions.
percall − It is the quotient of cumtime divided by primitive calls.
filename:lineno(function) − It identifies each function by its file name, line number and function name.
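Because the full report can be long, it may help to sort it and print only the most expensive entries. The following is a minimal sketch that profiles from inside a script with cProfile.Profile and formats the result with the standard pstats module; busy_sum() is a made-up workload used only for illustration.

```python
import cProfile
import io
import pstats

def busy_sum(n):
    # Hypothetical workload used only for illustration.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()          # start collecting call statistics
busy_sum(100000)
profiler.disable()         # stop collecting

# Write the report into a string, sorted by cumulative time,
# and limit it to the five most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the functions that dominate the total run time at the top, which is usually where optimization should start.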
Event-Driven Programming
Event-driven programming focuses on events, and the flow of the program depends upon them. Until now, we were dealing with either the sequential or the parallel execution model; the model built on event-driven programming is called the asynchronous model. Event-driven programming depends upon an event loop that is always listening for new incoming events. Once the event loop is running, the events decide what to execute and in what order. Following flowchart will help you understand how this works −

Python Module – Asyncio

The asyncio module was added in Python 3.4, and it provides infrastructure for writing single-threaded concurrent code using coroutines. Following are the different concepts used by the asyncio module −

The event loop

The event loop is the mechanism that handles all the events in a program. It runs throughout the execution of the whole program and keeps track of incoming events and their execution. By default, asyncio manages one event loop per context, where the context is the current thread. Following are some methods provided by the asyncio module to manage an event loop −

loop = asyncio.get_event_loop() − This method will provide the event loop for the current context.
loop.call_later(time_delay, callback, argument) − This method arranges for the callback to be called after the given time_delay seconds.
loop.call_soon(callback, argument) − This method arranges for a callback to be called as soon as possible. The callback is called after call_soon() returns and when control returns to the event loop.
loop.time() − This method is used to return the current time according to the event loop's internal clock.
asyncio.set_event_loop(loop) − This method will set the event loop for the current context to loop.
asyncio.new_event_loop() − This method will create and return a new event loop object.
loop.run_forever() − This method will run the event loop until the stop() method is called.

Example

The following example of an event loop prints Hello World by using the get_event_loop() method. This example is taken from the official Python docs.

    import asyncio

    def hello_world(loop):
        print("Hello World")
        loop.stop()

    # Note: on recent Python versions, get_event_loop() may warn or fail when
    # no loop has been set; asyncio.new_event_loop() can be used there instead.
    loop = asyncio.get_event_loop()
    loop.call_soon(hello_world, loop)   # schedule the callback
    loop.run_forever()                  # runs until hello_world() calls stop()
    loop.close()

Output

    Hello World

Futures

The asyncio.futures.Future class represents a computation that has not been completed yet. It is largely compatible with the concurrent.futures.Future class, with the following differences −

result() and exception() methods do not take a timeout argument and raise an exception when the future isn't done yet.
Callbacks registered with add_done_callback() are always called via the event loop's call_soon().
asyncio.futures.Future class is not compatible with the wait() and as_completed() functions in the concurrent.futures package.

Example

The following is an example that will help you understand how to use the asyncio.futures.Future class.

    import asyncio

    async def Myoperation(future):
        await asyncio.sleep(2)
        future.set_result("Future Completed")   # resolves the future

    loop = asyncio.get_event_loop()
    future = asyncio.Future()
    asyncio.ensure_future(Myoperation(future))
    try:
        loop.run_until_complete(future)
        print(future.result())
    finally:
        loop.close()

Output

    Future Completed

Coroutines

The concept of coroutines in asyncio is similar to the concept of the standard Thread object in the threading module. A coroutine is a generalization of the subroutine concept: it can be suspended during execution while it waits for external processing, and it resumes from the point at which it had stopped once that processing is done.

The following two ways help us in implementing coroutines −

async def function()

This is a method for implementation of coroutines under the asyncio module.
Following is a Python script for the same −

    import asyncio

    async def Myoperation():
        print("First Coroutine")

    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(Myoperation())
    finally:
        loop.close()

Output

    First Coroutine

@asyncio.coroutine decorator

Another method for implementation of coroutines is to utilize generators with the @asyncio.coroutine decorator. Generator-based coroutines were deprecated in Python 3.8 and the decorator was removed in Python 3.11, so this form only works on older versions. Following is a Python script for the same −

    import asyncio

    @asyncio.coroutine
    def Myoperation():
        print("First Coroutine")

    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(Myoperation())
    finally:
        loop.close()

Output

    First Coroutine

Tasks

Task is a subclass of Future in the asyncio module that is responsible for executing coroutines within an event loop concurrently. The following Python script is an example of processing some tasks concurrently.

    import asyncio

    async def Task_ex(n):
        # asyncio.sleep() yields to the event loop; time.sleep() would block it.
        await asyncio.sleep(1)
        print("Processing {}".format(n))

    async def Generator_task():
        for i in range(10):
            asyncio.ensure_future(Task_ex(i))   # schedule, do not wait
        print("Tasks Completed")
        await asyncio.sleep(2)                  # give the tasks time to finish

    loop = asyncio.get_event_loop()
    loop.run_until_complete(Generator_task())
    loop.close()

Output

    Tasks Completed
    Processing 0
    Processing 1
    Processing 2
    Processing 3
    Processing 4
    Processing 5
    Processing 6
    Processing 7
    Processing 8
    Processing 9

Transports

The asyncio module provides transport classes for implementing various types of communication. These classes are not thread-safe and are always paired with a protocol instance after the communication channel is established.

Following are distinct types of transports inherited from BaseTransport −

ReadTransport − This is an interface for read-only transports.
WriteTransport − This is an interface for write-only transports.
DatagramTransport − This is an interface for datagram transports.
BaseSubprocessTransport − This is an interface for transports that connect to a child process.

Following are methods of the BaseTransport class that are inherited by the four transport types −

close() − It closes the transport.
is_closing() − This method will return True if the transport is closing or is already closed.
get_extra_info(name, default = None) − This will give us some extra information about the transport.
get_protocol() − This method will return the current protocol.

Protocols

The asyncio module provides base classes that you can subclass to implement your network protocols. Those classes are used in conjunction with transports; the protocol parses incoming data and asks for the writing of outgoing data, while the transport is responsible for the actual I/O and buffering.

Following are three classes of Protocol −

Protocol − This is the base class for implementing streaming protocols for use with TCP and SSL transports.
DatagramProtocol − This is the base class for implementing datagram protocols for use with UDP transports.
SubprocessProtocol − This is the base class for implementing protocols communicating with child processes through a set of unidirectional pipes.
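The division of labour between a protocol and a transport can be sketched without any network access. In the sketch below, UpperEchoProtocol and RecordingTransport are hypothetical classes written for this illustration: the protocol decides what to send, and the stand-in transport simply records what it is asked to write instead of performing real I/O.

```python
import asyncio

class UpperEchoProtocol(asyncio.Protocol):
    """Hypothetical protocol: replies with the upper-cased payload."""

    def connection_made(self, transport):
        # Called once, when the transport is paired with this protocol.
        self.transport = transport

    def data_received(self, data):
        # Parse incoming data and ask the transport to write the reply.
        self.transport.write(data.upper())

    def connection_lost(self, exc):
        self.transport = None

class RecordingTransport(asyncio.Transport):
    """Stand-in transport that records writes instead of doing real I/O."""

    def __init__(self):
        super().__init__()
        self.written = []
        self._closing = False

    def write(self, data):
        self.written.append(data)

    def is_closing(self):
        return self._closing

    def close(self):
        self._closing = True

# Drive the protocol by hand, in the order an event loop would call it.
transport = RecordingTransport()
protocol = UpperEchoProtocol()
protocol.connection_made(transport)
protocol.data_received(b"hello")
transport.close()
protocol.connection_lost(None)

print(transport.written)  # prints [b'HELLO']
```

In a real program the event loop creates the transport and makes these calls itself, for example when the protocol class is passed to loop.create_server().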
Introduction
In this chapter, we will understand the concept of concurrency in Python and learn about threads and processes.

What is Concurrency?

In simple words, concurrency is the occurrence of two or more events at the same time. Concurrency is a natural phenomenon because many events occur simultaneously at any given time.

In terms of programming, concurrency is when two tasks overlap in execution. With concurrent programming, the performance of our applications and software systems can be improved because we can deal with requests concurrently rather than waiting for a previous one to be completed.

Historical Review of Concurrency

Following points will give us a brief historical review of concurrency −

From the concept of railroads

Concurrency is closely related to the concept of railroads. With the railroads, there was a need to handle multiple trains on the same railroad system in such a way that every train would get to its destination safely.

Concurrent computing in academia

The interest in computer science concurrency began with the research paper published by Edsger W. Dijkstra in 1965. In this paper, he identified and solved the problem of mutual exclusion, a property of concurrency control.

High-level concurrency primitives

In recent times, programmers are getting improved concurrent solutions because of the introduction of high-level concurrency primitives.

Improved concurrency with programming languages

Programming languages such as Google's Golang, Rust and Python have made incredible developments in areas which help us get better concurrent solutions.

What is thread & multithreading?

A thread is the smallest unit of execution that can be performed in an operating system. It is not itself a program but runs within a program. In other words, threads are not independent of one another. Each thread shares the code section, data section, etc. with other threads.
They are also known as lightweight processes.

A thread consists of the following components −

Program counter, which holds the address of the next executable instruction
Stack
Set of registers
A unique id

Multithreading, on the other hand, is the ability of a CPU, with support from the operating system, to execute multiple threads concurrently. The main idea of multithreading is to achieve parallelism by dividing a process into multiple threads. The concept of multithreading can be understood with the help of the following example.

Example

Suppose we are running a particular process wherein we open MS Word to type content into it. One thread will be assigned to open MS Word and another thread will be required to type content in it. If we now want to edit the existing content, another thread will be required to do the editing task, and so on.

What is process & multiprocessing?

A process is defined as an entity which represents the basic unit of work to be implemented in the system. To put it in simple terms, we write our computer programs in a text file, and when we execute this program, it becomes a process that performs all the tasks mentioned in the program. During its life cycle, a process passes through different stages – Start, Ready, Running, Waiting and Terminating.

Following diagram shows the different stages of a process −

A process can have only one thread, called the primary thread, or multiple threads, each having its own set of registers, program counter and stack. Following diagram will show us the difference −

Multiprocessing, on the other hand, is the use of two or more CPU units within a single computer system. Our primary goal is to get the full potential from our hardware. To achieve this, we need to utilize all the CPU cores available in our computer system, and multiprocessing is the best approach to do so.

Python is one of the most popular programming languages.
Following are some reasons that make it suitable for concurrent applications −

Syntactic sugar

Syntactic sugar is syntax within a programming language that is designed to make things easier to read or to express. It makes the language "sweeter" for human use: things can be expressed more clearly, more concisely, or in an alternative style based on preference. Python comes with magic methods, which can be defined to act on objects. These magic methods are used as syntactic sugar and are bound to more easy-to-understand keywords.

Large Community

The Python language has witnessed a massive adoption rate amongst data scientists and mathematicians working in the fields of AI, machine learning, deep learning and quantitative analysis.

Useful APIs for concurrent programming

Python 2 and 3 have a large number of APIs dedicated to parallel/concurrent programming. The most popular of them are threading, concurrent.futures, multiprocessing, asyncio, gevent and greenlets.

Limitations of Python in implementing concurrent applications

Python comes with a limitation for concurrent applications, called the GIL (Global Interpreter Lock). Because of the GIL, the threads of a single CPython process cannot execute Python code on multiple CPU cores at the same time, so multithreading does not speed up CPU-bound Python code. We can understand the concept of GIL as follows −

GIL (Global Interpreter Lock)

It is one of the most controversial topics in the Python world. In CPython, the GIL is a mutex, that is, a mutual exclusion lock, which makes things thread-safe. In other words, the GIL prevents multiple threads from executing Python code in parallel. The lock can be held by only one thread at a time, and a thread must acquire the lock before it can execute. The diagram shown below will help you understand the working of GIL.

However, some libraries and implementations are less affected by the GIL. NumPy can release the GIL during many of its operations, and alternative implementations such as Jython and IronPython do not have a GIL.
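The earlier description of threads, which run inside one process, share its data and each have a unique id, can be illustrated with a short sketch. The names records, barrier and report() are introduced here for the example; the barrier only keeps all three threads alive at the same time so that their ids can be compared.

```python
import os
import threading

records = []                      # shared by all threads in the process
records_lock = threading.Lock()
barrier = threading.Barrier(3)    # keeps all three threads alive together

def report():
    barrier.wait()                # wait until every thread has started
    with records_lock:
        records.append((os.getpid(), threading.get_ident()))
    barrier.wait()                # stay alive until all threads have reported

threads = [threading.Thread(target=report) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

pids = {pid for pid, _ in records}
idents = {ident for _, ident in records}
print("process ids:", len(pids), "thread ids:", len(idents))
# prints "process ids: 1 thread ids: 3"
```

All three threads report the same process id and write into the same list, while each one has its own thread identifier.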
Home
Concurrency in Python Tutorial

Concurrency, a natural phenomenon, is the happening of two or more events at the same time. Creating concurrent applications that get the most out of computer hardware is a challenging task for professionals.

Audience

This tutorial will be useful for graduates, postgraduates, and research students who either have an interest in this subject or have this subject as a part of their curriculum. The reader can be a beginner or an advanced learner.

Prerequisites

The reader must have basic knowledge of operating system concepts such as concurrency, multiprocessing, threads and processes. The reader should also be familiar with the basic terminology used in operating systems and with Python programming concepts.
Pool of Threads
Suppose we had to create a large number of threads for our multithreaded tasks. It would be computationally expensive, and too many threads can cause performance issues; a major one is that throughput becomes limited. We can solve this problem by creating a pool of threads. A thread pool may be defined as a group of pre-instantiated, idle threads which stand ready to be given work. Creating a thread pool is preferred over instantiating new threads for every task when we need to do a large number of tasks. A thread pool can manage the concurrent execution of a large number of tasks as follows −

If a thread in a thread pool completes its execution, then that thread can be reused.
If a thread is terminated, another thread will be created to replace that thread.

Python Module – Concurrent.futures

The Python standard library includes the concurrent.futures module. This module was added in Python 3.2 to give developers a high-level interface for launching asynchronous tasks. It is an abstraction layer on top of Python's threading and multiprocessing modules that provides an interface for running tasks using a pool of threads or processes.

In our subsequent sections, we will learn about the different classes of the concurrent.futures module.

Executor Class

Executor is an abstract class of the concurrent.futures Python module. It cannot be used directly, and we need to use one of the following concrete subclasses −

ThreadPoolExecutor
ProcessPoolExecutor

ThreadPoolExecutor – A Concrete Subclass

It is one of the concrete subclasses of the Executor class. The subclass uses multithreading, and we get a pool of threads for submitting the tasks. This pool assigns tasks to the available threads and schedules them to run.

How to create a ThreadPoolExecutor?

With the help of the concurrent.futures module and its concrete subclass ThreadPoolExecutor, we can easily create a pool of threads.
For this, we need to construct a ThreadPoolExecutor with the number of threads we want in the pool. If we do not pass a number, the default depends on the Python version and on the number of CPUs. Then we can submit a task to the thread pool. When we submit() a task, we get back a Future. The Future object has a method called done(), which tells whether the future has resolved, that is, whether a value has been set for that particular future object. When a task finishes, the thread pool executor sets the value on the future object.

Example

    from concurrent.futures import ThreadPoolExecutor
    from time import sleep

    def task(message):
        sleep(2)
        return message

    def main():
        executor = ThreadPoolExecutor(5)
        future = executor.submit(task, "Completed")
        print(future.done())      # False: the task is still sleeping
        sleep(3)                  # wait a little longer than the task takes
        print(future.done())      # True: the task has finished
        print(future.result())

    if __name__ == "__main__":
        main()

Output

    False
    True
    Completed

In the above example, a ThreadPoolExecutor has been constructed with 5 threads. Then a task, which waits for 2 seconds before returning the message, is submitted to the thread pool executor. As seen from the output, the task does not complete until 2 seconds have passed, so the first call to done() returns False. Once the task has finished, done() returns True and we get the result of the future by calling the result() method on it.

Instantiating ThreadPoolExecutor – Context Manager

Another way to instantiate ThreadPoolExecutor is with the help of a context manager. It works similarly to the method used in the above example. The main advantages of using a context manager are that it reads well and that the executor is shut down automatically when the block exits. The instantiation can be done with the help of the following code −

    with ThreadPoolExecutor(max_workers = 5) as executor:

Example

The following example is borrowed from the Python docs. In this example, first of all the concurrent.futures module has to be imported. Then a function named load_url() is created which will load the requested URL. The script then creates a ThreadPoolExecutor with 5 threads in the pool. The ThreadPoolExecutor has been utilized as a context manager.
We can get the result of the future by calling the result() method on it.

    import concurrent.futures
    import urllib.request

    URLS = ['http://www.foxnews.com/',
            'http://www.cnn.com/',
            'http://europe.wsj.com/',
            'http://www.bbc.co.uk/',
            'http://some-made-up-domain.com/']

    # Retrieve a single page and return its contents.
    def load_url(url, timeout):
        with urllib.request.urlopen(url, timeout = timeout) as conn:
            return conn.read()

    with concurrent.futures.ThreadPoolExecutor(max_workers = 5) as executor:
        # Start the load operations and remember which future belongs to which URL.
        future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (url, exc))
            else:
                print('%r page is %d bytes' % (url, len(data)))

Output

Following would be the output of the above Python script −

    'http://some-made-up-domain.com/' generated an exception: <urlopen error [Errno 11004] getaddrinfo failed>
    'http://www.foxnews.com/' page is 229313 bytes
    'http://www.cnn.com/' page is 168933 bytes
    'http://www.bbc.co.uk/' page is 283893 bytes
    'http://europe.wsj.com/' page is 938109 bytes

Use of Executor.map() function

The Python map() function is widely used in a number of tasks. One such task is to apply a certain function to every element within iterables. Similarly, we can map all the elements of an iterator to a function and submit these as independent jobs to our ThreadPoolExecutor. Consider the following example of a Python script to understand how the function works.

Example

In the example below, the map function is used to apply the square() function to every value in the values array.
    from concurrent.futures import ThreadPoolExecutor

    values = [2, 3, 4, 5]

    def square(n):
        return n * n

    def main():
        with ThreadPoolExecutor(max_workers = 3) as executor:
            # map() returns the results in the same order as the inputs.
            results = executor.map(square, values)
            for result in results:
                print(result)

    if __name__ == "__main__":
        main()

Output

The above Python script generates the following output −

    4
    9
    16
    25
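The submit() and as_completed() pattern from the URL example can also be tried without network access. The following is a minimal sketch under that assumption; slow_square() and run_pool() are hypothetical names, and the short sleep only stands in for some I/O wait.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep

def slow_square(n):
    # Hypothetical task: pretend to wait for I/O, then return a result.
    sleep(0.1)
    return n * n

def run_pool(values):
    results = {}
    with ThreadPoolExecutor(max_workers=3) as executor:
        # Remember which input each future belongs to.
        future_to_value = {executor.submit(slow_square, v): v for v in values}
        # as_completed() yields futures as they finish, in any order.
        for future in as_completed(future_to_value):
            results[future_to_value[future]] = future.result()
    return results

if __name__ == "__main__":
    print(sorted(run_pool([2, 3, 4, 5]).items()))
```

Unlike executor.map(), which returns results in input order, as_completed() hands back each future as soon as it is done, which is why the sketch keys the results by input value.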
Threads Intercommunication
In real life, if a team of people is working on a common task, then there should be communication between them for finishing the task properly. The same analogy is applicable to threads also. In programming, to reduce the idle time of the processor, we create multiple threads and assign different sub tasks to every thread. Hence, there must be a communication facility, and the threads should interact with each other to finish the job in a synchronized manner.

Consider the following important points related to thread intercommunication −

No performance gain − If we cannot achieve proper communication between threads and processes, then the performance gains from concurrency and parallelism are of no use.
Accomplish task properly − Without a proper intercommunication mechanism between threads, the assigned task cannot be completed properly.
More efficient than inter-process communication − Inter-thread communication is more efficient and easier to use than inter-process communication because all threads within a process share the same address space, so they do not need a separate shared-memory mechanism.

Python data structures for thread-safe communication

Multithreaded code comes with the problem of passing information from one thread to another. The standard communication primitives do not solve this issue. Hence, we need to implement our own composite object in order to share objects between threads in a thread-safe way. Following are a few data structures which provide thread-safe communication after making some changes in them −

Sets

To use the set data structure in a thread-safe manner, we need to extend the set class to implement our own locking mechanism.
Example

Here is a Python example of extending the class −

    from threading import Lock

    class extend_class(set):
        def __init__(self, *args, **kwargs):
            self._lock = Lock()
            super(extend_class, self).__init__(*args, **kwargs)

        def add(self, elem):
            self._lock.acquire()
            try:
                super(extend_class, self).add(elem)
            finally:
                self._lock.release()

        def delete(self, elem):
            self._lock.acquire()
            try:
                # set has no delete() method; remove() is the built-in equivalent.
                super(extend_class, self).remove(elem)
            finally:
                self._lock.release()

In the above example, a class named extend_class has been defined which inherits from the Python set class. A lock object is created within the constructor of this class. There are two methods, add() and delete(), and both are thread-safe. They both rely on the super class functionality with one key addition: the lock is held while the underlying set is modified.

Decorator

Another key method for thread-safe communication is the use of decorators.

Example

Consider a Python example that shows how to use decorators −

    from threading import Lock

    def lock_decorator(method):
        def new_deco_method(self, *args, **kwargs):
            # Run the wrapped method while holding the instance's lock.
            with self._lock:
                return method(self, *args, **kwargs)
        return new_deco_method

    class Decorator_class(set):
        def __init__(self, *args, **kwargs):
            self._lock = Lock()
            super(Decorator_class, self).__init__(*args, **kwargs)

        @lock_decorator
        def add(self, elem):
            return super(Decorator_class, self).add(elem)

        @lock_decorator
        def delete(self, elem):
            # set has no delete() method; remove() is the built-in equivalent.
            return super(Decorator_class, self).remove(elem)

In the above example, a decorator function named lock_decorator has been defined; it wraps a method so that the method runs while the instance's lock is held. The lock object is created within the constructor of Decorator_class. There are two methods, add() and delete(), and both are thread-safe. They both rely on the super class functionality with one key addition: the decorator acquires the lock around each call.

Lists

The list data structure is a thread-safe, quick and easy structure for temporary, in-memory storage. In CPython, the GIL protects against concurrent access to lists.
We have seen that lists are thread-safe, but what about the data stored in them? Actually, the list's data is not protected. For example, L.append(x) is an atomic operation on its own, but a sequence of steps such as reading an item, modifying it and writing it back is not guaranteed to give the expected result if another thread is modifying the same data concurrently; the output can then show the side effects of race conditions.

To resolve this kind of issue and safely modify the data, we must implement a proper locking mechanism, which ensures that multiple threads cannot run into race conditions. To implement a proper locking mechanism, we can extend the class as we did in the previous examples.

Some atomic operations on lists and other built-in types are as follows −

L.append(x)
L1.extend(L2)
x = L[i]
x = L.pop()
L1[i:j] = L2
L.sort()
x = y
x.field = y
D[x] = y
D1.update(D2)
D.keys()

Here −

L, L1, L2 are lists
D, D1, D2 are dicts
x, y are objects
i, j are ints

Queues

If the list's data is not protected, we might have to face the consequences. We may get or delete the wrong data item because of race conditions. That is why it is recommended to use the queue data structure. A real-world example of a queue is a single-lane one-way road, where the vehicle that enters first exits first. More real-world examples are the queues at ticket windows and bus stops.

Queues are thread-safe data structures by default, and we need not worry about implementing a complex locking mechanism. Python provides the <queue> module so that we can use different types of queues in our application.

Types of Queues

In this section, we will learn about the different types of queues. Python provides three kinds of queues in the <queue> module −

Normal Queues (FIFO, First in First out)
LIFO, Last in First Out
Priority

We will learn about the different queues in the subsequent sections.
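Before looking at each type in detail, the difference between the three can be sketched in a few lines. This is a minimal illustration, not a complete treatment: the helper drain() and the variable names are hypothetical, while Queue, LifoQueue and PriorityQueue are the classes provided by the standard queue module.

```python
import queue

fifo = queue.Queue()          # first in, first out
lifo = queue.LifoQueue()      # last in, first out
prio = queue.PriorityQueue()  # lowest value comes out first

# Put the same items into each queue in the same order.
for item in (3, 1, 2):
    fifo.put(item)
    lifo.put(item)
    prio.put(item)

def drain(q):
    # Hypothetical helper: remove every item and return them in retrieval order.
    items = []
    while not q.empty():
        items.append(q.get())
    return items

fifo_order = drain(fifo)
lifo_order = drain(lifo)
prio_order = drain(prio)

print("FIFO:    ", fifo_order)   # [3, 1, 2]
print("LIFO:    ", lifo_order)   # [2, 1, 3]
print("Priority:", prio_order)   # [1, 2, 3]
```

The same put() and get() calls are used in all three cases; only the retrieval order differs.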
Normal Queues (FIFO, First in First out)

It is the most commonly used queue implementation offered by Python. In this queuing mechanism, whoever comes first gets the service first. FIFO queues are also called normal queues. FIFO queues can be represented as follows −

Python Implementation of FIFO Queue

In Python, a FIFO queue can be used with a single thread as well as with multiple threads.

FIFO queue with single thread

For a FIFO queue with a single thread, the Queue class implements a basic first-in, first-out container. Elements are added to one "end" of the sequence using put(), and removed from the other end using get().

Example

Following is a Python program for implementation of a FIFO queue with a single thread −

    import queue

    q = queue.Queue()

    # Add eight items at one end of the queue.
    for i in range(8):
        q.put("item-" + str(i))

    # Remove them from the other end, in insertion order.
    while not q.empty():
        print(q.get(), end=" ")

Output

item-0 item-1
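The same Queue class can also be shared between threads. The following is a minimal sketch of that multithreaded case, assuming one producer and one consumer; the function names producer and consumer, the list consumed and the use of None as a stop marker are choices made for this illustration only.

```python
import queue
import threading

q = queue.Queue()
consumed = []  # items in the order the consumer received them

def producer():
    for i in range(5):
        q.put("item-" + str(i))
    q.put(None)  # sentinel value (an assumption of this sketch): tells the consumer to stop

def consumer():
    while True:
        item = q.get()  # blocks until an item is available
        if item is None:
            break
        consumed.append(item)

workers = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in workers:
    t.start()
for t in workers:
    t.join()

# With a single producer and a single consumer, FIFO order is preserved.
print(consumed)
```

No explicit lock appears in the code because put() and get() handle the locking internally.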
Concurrency vs Parallelism
Both concurrency and parallelism are used in relation to multithreaded programs, but there is a lot of confusion about the similarities and differences between them. The big question in this regard: is concurrency the same as parallelism? Although the two terms appear quite similar, the answer is no; concurrency and parallelism are not the same. If they are not the same, then what is the basic difference between them?

In simple terms, concurrency deals with managing access to shared state from different threads, whereas parallelism deals with utilizing multiple CPUs or CPU cores to improve performance.

Concurrency in Detail

Concurrency is when two tasks overlap in execution. It could be a situation where an application is progressing on more than one task at the same time. We can understand it diagrammatically; multiple tasks are making progress at the same time, as follows −

Levels of Concurrency

In this section, we will discuss the three important levels of concurrency in terms of programming −

Low-Level Concurrency

In this level of concurrency, there is explicit use of atomic operations. We cannot use this kind of concurrency for application building, as it is very error-prone and difficult to debug. Python itself does not support this kind of concurrency.

Mid-Level Concurrency

In this level of concurrency, there is no use of explicit atomic operations; explicit locks are used instead. Python and other programming languages support this kind of concurrency. Application programmers mostly use this level.

High-Level Concurrency

In this level of concurrency, neither explicit atomic operations nor explicit locks are used. Python has the concurrent.futures module to support this kind of concurrency.

Properties of Concurrent Systems

For a program or concurrent system to be correct, some properties must be satisfied by it.
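Before turning to those properties, the high-level style described under High-Level Concurrency can be sketched with concurrent.futures. This is only an illustrative sketch: square() is a hypothetical stand-in for real work, and the pool size of 3 is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    # Hypothetical stand-in for a real task.
    return n * n

# The executor creates and manages the threads; the code below uses
# no explicit locks or atomic operations.
with ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns the results in the same order as the inputs.
    results = list(executor.map(square, range(5)))

print(results)  # [0, 1, 4, 9, 16]
```

The with block waits for all submitted tasks to finish before it exits, so results is complete by the time it is printed.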
Properties related to the termination of the system are as follows −

Correctness property

The correctness property means that the program or the system must provide the desired correct answer. To keep it simple, we can say that the system must map the starting program state to the final state correctly.

Safety property

The safety property means that the program or the system must remain in a "good" or "safe" state and never do anything "bad".

Liveness property

This property means that a program or system must "make progress" and eventually reach some desirable state.

Actors of concurrent systems

A common property of concurrent systems is that there can be multiple processes and threads running at the same time, each making progress on its own task. These processes and threads are called the actors of the concurrent system.

Resources of Concurrent Systems

The actors must use resources such as memory, disk, printer, etc. in order to perform their tasks.

Certain set of rules

Every concurrent system must possess a set of rules that define the kind of tasks to be performed by the actors and the timing of each. The tasks could be acquiring locks, sharing memory, modifying state, etc.

Barriers of Concurrent Systems

While implementing concurrent systems, the programmer must take into consideration the following two important issues, which can be the barriers of concurrent systems −

Sharing of data

An important issue while implementing concurrent systems is the sharing of data among multiple threads or processes. The programmer must ensure that locks protect the shared data so that all accesses to it are serialized and only one thread or process can access the shared data at a time. When multiple threads or processes all try to access the same shared data, at least one of them is blocked and remains idle.
In other words, only one process or thread can be used at a time while the lock is in force. There are some simple solutions to remove the above-mentioned barrier −

Data Sharing Restriction

The simplest solution is not to share any mutable data. In this case, we need not use explicit locking, and the barrier of concurrency due to mutable data is removed.

Data Structure Assistance

Many times the concurrent processes need to access the same data at the same time. Another solution, instead of using explicit locks, is to use a data structure that supports concurrent access. For example, we can use the queue module, which provides thread-safe queues. We can also use the multiprocessing.JoinableQueue class for multiprocessing-based concurrency.

Immutable Data Transfer

Sometimes the data structure that we are using, say a concurrent queue, is not suitable; then we can pass immutable data without locking it.

Mutable Data Transfer

In continuation of the above solution, if it is required to pass mutable data rather than immutable data, then we can pass mutable data that is treated as read-only.

Sharing of I/O Resources

Another important issue in implementing concurrent systems is the use of I/O resources by threads or processes. The problem arises when one thread or process uses the I/O for a long time while another sits idle. We can see this kind of barrier while working with an I/O heavy application. It can be understood with the help of an example: the requesting of pages from a web browser, which is an I/O heavy application. Here, if the rate at which the data is requested is slower than the rate at which it is consumed, then we have an I/O barrier in our concurrent system.
The following Python script requests a web page and measures the time the network took to fetch it −

    import urllib.request
    import time

    ts = time.time()  # timestamp before the request
    req = urllib.request.urlopen("https://www.tutorialspoint.com")
    pageHtml = req.read()
    te = time.time()  # timestamp after the page has been read
    print("Page Fetching Time: {} Seconds".format(te - ts))

After executing the above script, we can get the page fetching time as shown below.

Output

Page Fetching Time: 1.0991398811340332 Seconds

We can see that the time to fetch the page is more
Threads
In everyday language, a thread is a very thin twisted string, usually of cotton or silk, used for sewing clothes and the like. The same term is also used in the world of computer programming. How do we relate the thread used for sewing clothes to the thread used in computer programming? The roles performed by the two are similar. In clothes, a thread holds the cloth together; in computer programming, a thread holds the computer program together and allows the program to execute sequential actions or many actions at once.

A thread is the smallest unit of execution in an operating system. It is not in itself a program but runs within a program. In other words, threads are not independent of one another and share the code section, data section, etc. with other threads. Threads are also known as lightweight processes.

States of Thread

To understand the functionality of threads in depth, we need to learn about the lifecycle of threads, that is, the different thread states. Typically, a thread can exist in five distinct states. The different states are shown below −

New Thread

A new thread begins its life cycle in the new state. At this stage, it has not yet started and has not been allocated any resources. We can say that it is just an instance of an object.

Runnable

When the newly born thread is started, it becomes runnable, i.e. waiting to run. In this state, it has all the resources it needs, but the task scheduler has not yet scheduled it to run.

Running

In this state, the thread has been chosen by the task scheduler to run; it makes progress and executes its task. From here, the thread can go to either the dead state or the non-running/waiting state.

Non-running/waiting

In this state, the thread is paused because it is either waiting for the response to some I/O request or waiting for the completion of another thread's execution.
Dead

A thread enters the dead (terminated) state when it completes its task or otherwise terminates.

The following diagram shows the complete life cycle of a thread −

Types of Thread

In this section, we will see the different types of thread. The types are described below −

User Level Threads

These are user-managed threads. In this case, the kernel is not aware of the existence of threads, and thread management is done by a thread library. The thread library contains code for creating and destroying threads, for passing messages and data between threads, for scheduling thread execution and for saving and restoring thread contexts. The application starts with a single thread.

Commonly cited examples of user level threads are −

Java threads
POSIX threads

Advantages of User level Threads

Following are the different advantages of user level threads −

Thread switching does not require Kernel mode privileges.
User level threads can run on any operating system.
Scheduling can be application specific in user level threads.
User level threads are fast to create and manage.

Disadvantages of User level Threads

Following are the different disadvantages of user level threads −

In a typical operating system, most system calls are blocking, so a blocking call made by one thread can block the whole process.
A multithreaded application cannot take advantage of multiprocessing.

Kernel Level Threads

Threads managed by the operating system act on the kernel, which is the core of the operating system. In this case, the Kernel does the thread management. There is no thread management code in the application area. Kernel threads are supported directly by the operating system. Any application can be programmed to be multithreaded. All of the threads within an application are supported within a single process.

The Kernel maintains context information for the process as a whole and for individual threads within the process. Scheduling by the Kernel is done on a thread basis. The Kernel performs thread creation, scheduling and management in Kernel space.
Kernel threads are generally slower to create and manage than user threads. Examples of operating systems with kernel level threads are Windows and Solaris.

Advantages of Kernel Level Threads

Following are the different advantages of kernel level threads −

The Kernel can simultaneously schedule multiple threads from the same process on multiple processors.
If one thread in a process is blocked, the Kernel can schedule another thread of the same process.
Kernel routines themselves can be multithreaded.

Disadvantages of Kernel Level Threads

Kernel threads are generally slower to create and manage than user threads.
Transfer of control from one thread to another within the same process requires a mode switch to the Kernel.

Thread Control Block – TCB

A Thread Control Block (TCB) may be defined as the data structure in the kernel of the operating system that mainly contains information about a thread. The thread-specific information stored in the TCB also reveals some important information about the owning process.

Consider the following items of information stored in the TCB −

Thread identification − It is the unique thread id (tid) assigned to every new thread.
Thread state − It contains the information related to the state (Running, Runnable, Non-Running, Dead) of the thread.
Program Counter (PC) − It points to the current program instruction of the thread.
Register set − It contains the register values assigned to the thread for computations.
Stack Pointer − It points to the thread's stack in the process. The stack contains the local variables under the thread's scope.
Pointer to PCB − It contains the pointer to the process that created the thread.

Relation between process & thread

In multithreading, process and thread are two very closely related terms that share the same goal: making the computer able to do more than one thing at a time. A process can contain one or more threads, but a thread cannot contain a process. However, they both remain the two basic units of execution.
A program, executing a series of instructions, initiates both a process and a thread.

The following table shows the comparison between process and thread −

Process Thread Process is heavy
Implementation of Threads
In this chapter, we will learn how to implement threads in Python.

Python Module for Thread Implementation

Python threads are sometimes called lightweight processes because threads occupy much less memory than processes. Threads allow performing multiple tasks at once. In Python, we have the following two modules that implement threads in a program −

<_thread> module
<threading> module

The main difference between these two modules is that the <_thread> module treats a thread as a function, whereas the <threading> module treats every thread as an object and implements it in an object oriented way. Moreover, the <_thread> module is meant for low level threading and has fewer capabilities than the <threading> module.

<_thread> module

In earlier versions of Python, we had the <thread> module, but it has been considered "deprecated" for quite a long time. Users have been encouraged to use the <threading> module instead. Therefore, in Python 3 the module "thread" is not available anymore. It has been renamed to "<_thread>" for backward compatibility in Python 3.

To generate a new thread with the help of the <_thread> module, we need to call its start_new_thread method. The working of this method can be understood with the help of the following syntax −

    _thread.start_new_thread(function, args[, kwargs])

Here −

args is a tuple of arguments
kwargs is an optional dictionary of keyword arguments

If we want to call the function without passing an argument, then we need to pass an empty tuple as args.

This method call returns immediately; the child thread starts and calls the function with the passed arguments, if any. The thread terminates when the function returns.

Example

Following is an example of generating new threads by using the <_thread> module. We are using the start_new_thread() method here.
    import _thread
    import time

    def print_time(threadName, delay):
        count = 0
        while count < 5:
            time.sleep(delay)
            count += 1
            print("%s: %s" % (threadName, time.ctime(time.time())))

    try:
        _thread.start_new_thread(print_time, ("Thread-1", 2))
        _thread.start_new_thread(print_time, ("Thread-2", 4))
    except Exception:
        print("Error: unable to start thread")

    # Keep the main thread alive so that the child threads can run.
    while 1:
        pass

Output

The following output will help us understand the generation of new threads with the help of the <_thread> module.

Thread-1: Mon Apr 23 10:03:33 2018
Thread-2: Mon Apr 23 10:03:35 2018
Thread-1: Mon Apr 23 10:03:35 2018
Thread-1: Mon Apr 23 10:03:37 2018
Thread-2: Mon Apr 23 10:03:39 2018
Thread-1: Mon Apr 23 10:03:39 2018
Thread-1: Mon Apr 23 10:03:41 2018
Thread-2: Mon Apr 23 10:03:43 2018
Thread-2: Mon Apr 23 10:03:47 2018
Thread-2: Mon Apr 23 10:03:51 2018

<threading> module

The <threading> module is implemented in an object oriented way and treats every thread as an object. Therefore, it provides much more powerful, high-level support for threads than the <_thread> module. This module has been included with Python since at least version 2.4.

Additional methods in the <threading> module

The <threading> module comprises all the methods of the <_thread> module, but it provides additional methods as well. The additional methods are as follows −

threading.activeCount() − This method returns the number of thread objects that are active. In current Python versions it is spelled threading.active_count().
threading.currentThread() − This method returns the thread object corresponding to the caller's thread of control. In current Python versions it is spelled threading.current_thread().
threading.enumerate() − This method returns a list of all thread objects that are currently active.

For implementing threading, the <threading> module has the Thread class, which provides the following methods −

run() − The run() method is the entry point for a thread.
start() − The start() method starts a thread, which in turn calls the run() method in that new thread.
join([time]) − The join() method waits for the thread to terminate, optionally for at most the given time.
isAlive() − The isAlive() method checks whether a thread is still executing. In current Python versions it is spelled is_alive().
getName() − The getName() method returns the name of a thread.
setName() − The setName() method sets the name of a thread.

How to create threads using the <threading> module?

In this section, we will learn how to create threads using the <threading> module. Follow these steps to create a new thread using the <threading> module −

Step 1 − In this step, we need to define a new subclass of the Thread class.
Step 2 − Then, for adding additional arguments, we need to override the __init__(self [,args]) method.
Step 3 − In this step, we need to override the run(self [,args]) method to implement what the thread should do when started.

After creating the new Thread subclass, we can create an instance of it and then start a new thread by invoking start(), which in turn calls the run() method.

Example

Consider this example to learn how to generate a new thread by using the <threading> module.

    import threading
    import time

    exitFlag = 0

    class myThread(threading.Thread):
        def __init__(self, threadID, name, counter):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.name = name
            self.counter = counter

        def run(self):
            print("Starting " + self.name)
            print_time(self.name, self.counter, 5)
            print("Exiting " + self.name)

    def print_time(threadName, delay, counter):
        while counter:
            if exitFlag:
                # threadName is a plain string, so simply return to end the thread early.
                return
            time.sleep(delay)
            print("%s: %s" % (threadName, time.ctime(time.time())))
            counter -= 1

    thread1 = myThread(1, "Thread-1", 1)
    thread2 = myThread(2, "Thread-2", 2)

    thread1.start()
    thread2.start()

    # Wait for both threads to finish before the main thread continues.
    thread1.join()
    thread2.join()
    print("Exiting Main Thread")

Output

Now, consider the following output −

Starting Thread-1
Starting Thread-2
Thread-1: Mon Apr 23 10:52:09 2018
Thread-1: Mon Apr 23 10:52:10 2018
Thread-2: Mon Apr 23 10:52:10 2018
Thread-1: Mon Apr 23 10:52:11 2018
Thread-1: Mon Apr 23 10:52:12 2018
Thread-2: Mon Apr 23 10:52:12 2018
Thread-1: Mon Apr 23 10:52:13 2018
Exiting Thread-1
Thread-2: Mon Apr 23 10:52:14 2018
Thread-2: Mon Apr 23 10:52:16 2018
Thread-2: Mon Apr 23 10:52:18 2018
Exiting Thread-2
Exiting Main Thread

Python Program for Various Thread States

There are five thread states: new, runnable, running, waiting and dead. Of these five, we will mainly focus on three states: running, waiting and dead. A thread gets its resources in the running state and waits for resources in the waiting state; the final release of the resources, if acquired, happens in the dead state.

The following Python program with the help of