Assembly – Procedures

Procedures, or subroutines, are important in assembly language because assembly programs tend to be large. A procedure is identified by a name. The name is followed by the body of the procedure, which performs a well-defined job. The end of the procedure is indicated by a return statement.

Syntax

The syntax to define a procedure is:

    proc_name:
       procedure body
       ...
       ret

A procedure is called from another routine with the CALL instruction, which takes the name of the called procedure as its argument:

    CALL proc_name

The called procedure returns control to the caller with the RET instruction.

Example

Let us write a very simple procedure named sum that adds the values stored in the ECX and EDX registers and returns the sum in the EAX register:

    section .text
       global _start        ;must be declared for linker (ld)

    _start:                 ;tell linker entry point
       mov  ecx, '4'
       sub  ecx, '0'        ;convert ASCII digit to its numeric value
       mov  edx, '5'
       sub  edx, '0'

       call sum             ;call sum procedure
       mov  [res], al       ;res is one byte, so store only the low byte
       mov  ecx, msg
       mov  edx, len
       mov  ebx, 1          ;file descriptor (stdout)
       mov  eax, 4          ;system call number (sys_write)
       int  0x80            ;call kernel

       mov  ecx, res
       mov  edx, 1
       mov  ebx, 1          ;file descriptor (stdout)
       mov  eax, 4          ;system call number (sys_write)
       int  0x80            ;call kernel

       mov  eax, 1          ;system call number (sys_exit)
       int  0x80            ;call kernel

    sum:
       mov  eax, ecx
       add  eax, edx
       add  eax, '0'        ;convert the result back to an ASCII digit
       ret

    section .data
    msg db "The sum is:", 0xA, 0xD
    len equ $ - msg

    segment .bss
    res resb 1

When the above code is assembled and executed, it produces the following result:

    The sum is:
    9

Stacks Data Structure

A stack is an array-like data structure in memory in which data can be stored and removed at a location called the 'top' of the stack. Data to be stored is 'pushed' onto the stack and data to be retrieved is 'popped' from the stack. A stack is a LIFO (last in, first out) data structure, i.e., the data stored first is retrieved last.
Assembly language provides two instructions for stack operations: PUSH and POP. Their syntax is:

    PUSH    operand
    POP     address/register

The memory space reserved in the stack segment is used for implementing the stack. The registers SS and ESP (or SP) are used for this purpose. The top of the stack, i.e., the last data item inserted, is pointed to by SS:ESP, where the SS register points to the beginning of the stack segment and SP (or ESP) gives the offset into the stack segment.

The stack implementation has the following characteristics:

- Only words or doublewords can be saved on the stack, not a byte.
- The stack grows in the reverse direction, i.e., toward lower memory addresses.
- The top of the stack points to the last item inserted; it points to the lower byte of the last word inserted.

Register values can be saved on the stack before the registers are reused, as follows:

    ; Save the AX and BX registers in the stack
    PUSH    AX
    PUSH    BX

    ; Use the registers for other purpose
    MOV     AX, VALUE1
    MOV     BX, VALUE2
    ...
    MOV     VALUE1, AX
    MOV     VALUE2, BX

    ; Restore the original values
    POP     BX
    POP     AX

Example

The following program displays the ASCII character set. The main program calls a procedure named display, which prints the characters one at a time.

    section .text
       global _start        ;must be declared for linker (ld)

    _start:                 ;tell linker entry point
       call display
       mov  eax, 1          ;system call number (sys_exit)
       int  0x80            ;call kernel

    display:
       mov  ecx, 256        ;number of characters to print

    next:
       push ecx             ;save the loop counter; ECX is needed for sys_write
       mov  eax, 4
       mov  ebx, 1
       mov  ecx, achar
       mov  edx, 1
       int  80h

       pop  ecx             ;restore the loop counter
       mov  dx, [achar]     ;value is not used afterwards
       cmp  byte [achar], 0dh   ;flags are not tested afterwards
       inc  byte [achar]    ;advance to the next character
       loop next
       ret

    section .data
    achar db '0'

When the above code is assembled and executed, it produces the following result:

    0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}
    ...
    ...
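The save-and-restore pattern and the CALL/RET mechanism described above can be combined in one small program. The following is only a sketch for 32-bit Linux in NASM syntax; the procedure name print_star and the label names are made up for illustration. The procedure pushes every register it touches and pops them in the reverse order, so the caller's loop counter in ECX survives each call.

```nasm
; Sketch: 32-bit Linux, NASM syntax. print_star is a hypothetical name.
section .text
    global _start

_start:
    mov   ecx, 3            ; loop counter owned by the caller
again:
    call  print_star        ; ECX survives because the procedure saves it
    loop  again             ; decrement ECX, repeat until it reaches 0

    mov   eax, 1            ; system call number (sys_exit)
    xor   ebx, ebx          ; exit status 0
    int   0x80

; print_star: writes one '*' to stdout and preserves the registers it uses.
print_star:
    push  eax               ; save in one order ...
    push  ebx
    push  ecx
    push  edx

    mov   eax, 4            ; system call number (sys_write)
    mov   ebx, 1            ; file descriptor (stdout)
    mov   ecx, star         ; address of the byte to print
    mov   edx, 1            ; number of bytes
    int   0x80

    pop   edx               ; ... and restore in the reverse order (LIFO)
    pop   ecx
    pop   ebx
    pop   eax
    ret

section .data
star    db '*'
```

Without the PUSH/POP pairs, the sys_write setup would overwrite ECX and the LOOP instruction in the caller would no longer count from 3.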

Assembly – Discussion

Assembly language is a low-level programming language for a computer or other programmable device. It is specific to a particular computer architecture, in contrast to most high-level programming languages, which are generally portable across multiple systems. Assembly language is converted into executable machine code by a utility program called an assembler, such as NASM or MASM.

Assembly – Arithmetic Instructions

The INC Instruction

The INC instruction is used for incrementing an operand by one. It works on a single operand that can be either in a register or in memory.

Syntax

The INC instruction has the following syntax:

    INC destination

The operand destination can be an 8-bit, 16-bit or 32-bit operand.

Example

    INC EBX           ; Increments 32-bit register
    INC DL            ; Increments 8-bit register
    INC WORD [count]  ; Increments the word variable count (NASM needs the size specifier)

The DEC Instruction

The DEC instruction is used for decrementing an operand by one. It works on a single operand that can be either in a register or in memory.

Syntax

The DEC instruction has the following syntax:

    DEC destination

The operand destination can be an 8-bit, 16-bit or 32-bit operand.

Example

    segment .data
       count dw 0
       value db 15

    segment .text
       inc word [count]     ; size specifier is required for a memory operand
       dec byte [value]

       mov ebx, count       ; EBX holds the address of count
       inc word [ebx]

       mov esi, value       ; ESI holds the address of value
       dec byte [esi]

The ADD and SUB Instructions

The ADD and SUB instructions are used for performing simple addition and subtraction of binary data in byte, word and doubleword size, i.e., for adding or subtracting 8-bit, 16-bit or 32-bit operands, respectively.

Syntax

The ADD and SUB instructions have the following syntax:

    ADD/SUB destination, source

The ADD/SUB instruction can take place between:

- Register to register
- Memory to register
- Register to memory
- Register to constant data
- Memory to constant data

However, like other instructions, memory-to-memory operations are not possible using ADD/SUB instructions. An ADD or SUB operation sets or clears the overflow and carry flags.

Example

The following example asks the user for two digits, stores the digits in the EAX and EBX registers, respectively, adds the values, stores the result in the memory location 'res' and finally displays the result.
    SYS_EXIT  equ 1
    SYS_READ  equ 3
    SYS_WRITE equ 4
    STDIN     equ 0
    STDOUT    equ 1

    segment .data
       msg1 db "Enter a digit ", 0xA, 0xD
       len1 equ $ - msg1

       msg2 db "Please enter a second digit", 0xA, 0xD
       len2 equ $ - msg2

       msg3 db "The sum is: "
       len3 equ $ - msg3

    segment .bss
       num1 resb 2
       num2 resb 2
       res  resb 1

    section .text
       global _start        ;must be declared for linker (ld)

    _start:                 ;tell linker entry point
       mov eax, SYS_WRITE
       mov ebx, STDOUT
       mov ecx, msg1
       mov edx, len1
       int 0x80

       mov eax, SYS_READ
       mov ebx, STDIN
       mov ecx, num1
       mov edx, 2
       int 0x80

       mov eax, SYS_WRITE
       mov ebx, STDOUT
       mov ecx, msg2
       mov edx, len2
       int 0x80

       mov eax, SYS_READ
       mov ebx, STDIN
       mov ecx, num2
       mov edx, 2
       int 0x80

       mov eax, SYS_WRITE
       mov ebx, STDOUT
       mov ecx, msg3
       mov edx, len3
       int 0x80

       ; load the first digit into eax and the second into ebx (one byte each,
       ; zero-extended) and subtract ASCII '0' to convert them to numbers
       movzx eax, byte [num1]
       sub   eax, '0'

       movzx ebx, byte [num2]
       sub   ebx, '0'

       ; add eax and ebx
       add eax, ebx
       ; add '0' to convert the sum from a number to ASCII
       add eax, '0'

       ; store the sum in memory location res (one byte)
       mov [res], al

       ; print the sum
       mov eax, SYS_WRITE
       mov ebx, STDOUT
       mov ecx, res
       mov edx, 1
       int 0x80

    exit:
       mov eax, SYS_EXIT
       xor ebx, ebx
       int 0x80

When the above code is assembled and executed, it produces the following result:

    Enter a digit:
    3
    Please enter a second digit:
    4
    The sum is: 7

Because the result is printed as a single character, the output is a digit only when the sum is 9 or less.

The program with hardcoded variables:

    section .text
       global _start        ;must be declared for linker (ld)

    _start:                 ;tell linker entry point
       mov eax, '3'
       sub eax, '0'

       mov ebx, '4'
       sub ebx, '0'
       add eax, ebx
       add eax, '0'

       mov [sum], al        ;sum is one byte, so store only the low byte
       mov ecx, msg
       mov edx, len
       mov ebx, 1           ;file descriptor (stdout)
       mov eax, 4           ;system call number (sys_write)
       int 0x80             ;call kernel

       mov ecx, sum
       mov edx, 1
       mov ebx, 1           ;file descriptor (stdout)
       mov eax, 4           ;system call number (sys_write)
       int 0x80             ;call kernel

       mov eax, 1           ;system call number (sys_exit)
       int 0x80             ;call kernel

    section .data
       msg db "The sum is:", 0xA, 0xD
       len equ $ - msg

    segment .bss
       sum resb 1

When the above code is assembled and executed, it produces the following result:

    The sum is:
    7

The MUL/IMUL Instruction

There are two instructions for multiplying binary data. The MUL (Multiply) instruction handles unsigned data and the IMUL (Integer Multiply) instruction handles signed data. Both instructions affect the Carry and Overflow flags.

Syntax

The syntax for the MUL/IMUL instructions is as follows:

    MUL/IMUL multiplier

In both cases the multiplicand is in an accumulator register chosen by the operand size, and the product is stored in a register pair that also depends on the operand size. The following three cases apply to MUL:

1. When two bytes are multiplied: the multiplicand is in the AL register, and the multiplier is a byte in memory or in another register. The product is in AX. The high-order 8 bits of the product are stored in AH and the low-order 8 bits are stored in AL.

2. When two one-word values are multiplied: the multiplicand should be in the AX register, and the multiplier is a word in memory or in another register. For example, for an instruction like MUL DX, you must store the multiplier in DX and the multiplicand in AX. The resulting product is a doubleword, which needs two registers. The high-order (leftmost) portion is stored in DX and the low-order (rightmost) portion is stored in AX.

3. When two doubleword values are multiplied: the multiplicand should be in EAX and the multiplier is a doubleword value stored in memory or in another register. The product is stored in the EDX:EAX registers, i.e., the high-order 32 bits are stored in the EDX register and the low-order 32 bits are stored in the EAX register.

Example

    MOV AL, 10
    MOV DL, 25
    MUL DL
    ...
    MOV DL, 0FFH

Assembly – Addressing Modes

Most assembly language instructions require operands to be processed. An operand address provides the location where the data to be processed is stored. Some instructions do not require an operand, whereas others may require one, two, or three operands.

When an instruction requires two operands, the first operand is generally the destination, which contains data in a register or memory location, and the second operand is the source. The source contains either the data to be delivered (immediate addressing) or the address (in a register or in memory) of the data. Generally, the source data remains unaltered after the operation.

The three basic modes of addressing are:

- Register addressing
- Immediate addressing
- Memory addressing

Register Addressing

In this addressing mode, a register contains the operand. Depending upon the instruction, the register may be the first operand, the second operand or both.

For example,

    MOV DX, TAX_RATE   ; Register in first operand
    MOV COUNT, CX      ; Register in second operand
    MOV EAX, EBX       ; Both the operands are in registers

As processing data between registers does not involve memory, it provides the fastest processing of data.

Immediate Addressing

An immediate operand has a constant value or an expression. When an instruction with two operands uses immediate addressing, the first operand may be a register or memory location, and the second operand is an immediate constant. The first operand defines the length of the data.

For example,

    BYTE_VALUE  DB  150    ; A byte value is defined
    WORD_VALUE  DW  300    ; A word value is defined
    ADD  BYTE_VALUE, 65    ; An immediate operand 65 is added
    MOV  AX, 45H           ; Immediate constant 45H is transferred to AX

Direct Memory Addressing

When operands are specified in memory addressing mode, direct access to main memory, usually to the data segment, is required. This way of addressing results in slower processing of data.
To locate the data in memory, we need the segment start address, which is typically found in the DS register, and an offset value. This offset value is also called the effective address.

In direct addressing mode, the offset value is specified directly as part of the instruction, usually indicated by the variable name. The assembler calculates the offset value and maintains a symbol table, which stores the offset values of all the variables used in the program.

In direct memory addressing, one of the operands refers to a memory location and the other operand references a register.

For example,

    ADD  BYTE_VALUE, DL   ; Adds the register to the memory location
    MOV  BX, WORD_VALUE   ; Operand from the memory is moved to the register

Direct-Offset Addressing

This addressing mode uses arithmetic operators to modify an address. For example, look at the following definitions that define tables of data:

    BYTE_TABLE DB  14, 15, 22, 45        ; Table of bytes
    WORD_TABLE DW  134, 345, 564, 123    ; Table of words

The following operations load data from the tables in memory into registers. The offset is counted in bytes, so the 4th word of WORD_TABLE lies at byte offset 3 x 2 = 6:

    MOV CL, BYTE_TABLE[2]    ; Gets the 3rd element of the BYTE_TABLE
    MOV CL, BYTE_TABLE + 2   ; Gets the 3rd element of the BYTE_TABLE
    MOV CX, WORD_TABLE[6]    ; Gets the 4th element of the WORD_TABLE
    MOV CX, WORD_TABLE + 6   ; Gets the 4th element of the WORD_TABLE

Indirect Memory Addressing

This addressing mode utilizes the computer's ability of Segment:Offset addressing. Generally, the base registers EBX, EBP (or BX, BP) and the index registers (DI, SI), coded within square brackets for memory references, are used for this purpose.

Indirect addressing is generally used for variables containing several elements, such as arrays. The starting address of the array is stored in, say, the EBX register.

The following code snippet shows how to access different elements of the variable.
    MY_TABLE TIMES 10 DW 0   ; Allocates 10 words (2 bytes each), initialized to 0
    MOV EBX, MY_TABLE        ; Effective address of MY_TABLE in EBX (NASM syntax)
    MOV WORD [EBX], 110      ; MY_TABLE[0] = 110
    ADD EBX, 2               ; EBX = EBX + 2
    MOV WORD [EBX], 123      ; MY_TABLE[1] = 123

The MOV Instruction

We have already used the MOV instruction, which moves data from one storage space to another. The MOV instruction takes two operands.

Syntax

The syntax of the MOV instruction is:

    MOV destination, source

The MOV instruction may have one of the following five forms:

    MOV register, register
    MOV register, immediate
    MOV memory, immediate
    MOV register, memory
    MOV memory, register

Please note that:

- Both the operands in a MOV operation should be of the same size
- The value of the source operand remains unchanged

The MOV instruction causes ambiguity at times. For example, look at the statements:

    MOV EBX, MY_TABLE   ; Effective address of MY_TABLE in EBX
    MOV [EBX], 110      ; MY_TABLE[0] = 110

It is not clear whether you want to move a byte equivalent or a word equivalent of the number 110. In such cases, it is wise to use a type specifier, for example MOV WORD [EBX], 110.

The following table shows some of the common type specifiers:

    Type Specifier   Bytes addressed
    BYTE             1
    WORD             2
    DWORD            4
    QWORD            8
    TBYTE            10

Example

The following program illustrates some of the concepts discussed above. It stores the name 'Zara Ali' in the data section of memory, then changes its value to another name, 'Nuha Ali', programmatically and displays both names.
    section .text
       global _start        ;must be declared for linker (ld)

    _start:                 ;tell linker entry point
       ;writing the name 'Zara Ali'
       mov edx, 9           ;message length
       mov ecx, name        ;message to write
       mov ebx, 1           ;file descriptor (stdout)
       mov eax, 4           ;system call number (sys_write)
       int 0x80             ;call kernel

       mov [name], dword 'Nuha'   ;changed the name to Nuha Ali

       ;writing the name 'Nuha Ali'
       mov edx, 8           ;message length
       mov ecx, name        ;message to write
       mov ebx, 1           ;file descriptor (stdout)
       mov eax, 4           ;system call number (sys_write)
       int 0x80             ;call kernel

       mov eax, 1           ;system call number (sys_exit)
       int 0x80             ;call kernel

    section .data
    name db 'Zara Ali '

When the above code is assembled and executed, it produces the following result:

    Zara Ali Nuha Ali

Assembly – Constants

NASM provides several directives that define constants. We have already used the EQU directive in previous chapters. We will particularly discuss three directives:

- EQU
- %assign
- %define

The EQU Directive

The EQU directive is used for defining constants. The syntax of the EQU directive is as follows:

    CONSTANT_NAME EQU expression

For example,

    TOTAL_STUDENTS equ 50

You can then use this constant value in your code, like:

    mov  ecx, TOTAL_STUDENTS
    cmp  eax, TOTAL_STUDENTS

The operand of an EQU statement can be an expression. Symbol names are case-sensitive in NASM, so the expression must use the names exactly as defined:

    LENGTH equ 20
    WIDTH  equ 10
    AREA   equ LENGTH * WIDTH

The above code segment defines AREA as 200.

Example

The following example illustrates the use of the EQU directive:

    SYS_EXIT  equ 1
    SYS_WRITE equ 4
    STDIN     equ 0
    STDOUT    equ 1

    section .text
       global _start        ;must be declared for linker (ld)

    _start:                 ;tell linker entry point
       mov eax, SYS_WRITE
       mov ebx, STDOUT
       mov ecx, msg1
       mov edx, len1
       int 0x80

       mov eax, SYS_WRITE
       mov ebx, STDOUT
       mov ecx, msg2
       mov edx, len2
       int 0x80

       mov eax, SYS_WRITE
       mov ebx, STDOUT
       mov ecx, msg3
       mov edx, len3
       int 0x80

       mov eax, SYS_EXIT    ;system call number (sys_exit)
       int 0x80             ;call kernel

    section .data
    msg1 db 'Hello, programmers!', 0xA, 0xD
    len1 equ $ - msg1

    msg2 db 'Welcome to the world of,', 0xA, 0xD
    len2 equ $ - msg2

    msg3 db 'Linux assembly programming! '
    len3 equ $ - msg3

When the above code is assembled and executed, it produces the following result:

    Hello, programmers!
    Welcome to the world of,
    Linux assembly programming!

The %assign Directive

The %assign directive can be used to define numeric constants like the EQU directive. Unlike EQU, this directive allows redefinition. For example, you may define the constant TOTAL as:

    %assign TOTAL 10

Later in the code, you can redefine it as:

    %assign TOTAL 20

This directive is case-sensitive.

The %define Directive

The %define directive allows defining both numeric and string constants. This directive is similar to #define in C.
For example, you may define the constant PTR as:

    %define PTR [EBP+4]

The above code replaces PTR by [EBP+4].

This directive also allows redefinition and it is case-sensitive.
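The three directives can be seen side by side in a short program. This is a minimal sketch for 32-bit Linux; the names COUNT and EXIT_REG are invented for the illustration. It relies on %assign evaluating its expression at the point where the line is processed, and on %define doing plain text substitution.

```nasm
; Sketch: 32-bit Linux, NASM syntax. COUNT and EXIT_REG are hypothetical names.
SYS_EXIT  equ 1             ; EQU: fixed for the whole source file

%assign COUNT 2             ; %assign: numeric constant, may be redefined
%assign COUNT COUNT + 3     ; redefinition; COUNT is now 5

%define EXIT_REG ebx        ; %define: text substitution, here a register name

section .text
    global _start

_start:
    mov   EXIT_REG, COUNT   ; expands to: mov ebx, 5
    mov   eax, SYS_EXIT     ; system call number (sys_exit)
    int   0x80              ; exit with status 5
```

After running the program, the shell command echo $? should report the exit status, which makes it easy to check what the constants expanded to.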

Assembly – Arrays

We have already discussed that the data definition directives to the assembler are used for allocating storage for variables. A variable can also be initialized with a specific value. The initial value can be specified in hexadecimal, decimal or binary form.

For example, we can define a word variable 'months' in any of the following ways:

    MONTHS  DW  12
    MONTHS  DW  0CH
    MONTHS  DW  1100B

The data definition directives can also be used for defining a one-dimensional array. Let us define a one-dimensional array of numbers.

    NUMBERS DW  34, 45, 56, 67, 75, 89

The above definition declares an array of six words initialized with the numbers 34, 45, 56, 67, 75, 89. This allocates 2 x 6 = 12 bytes of consecutive memory space. The symbolic address of the first number is NUMBERS, that of the second number is NUMBERS + 2, and so on.

Let us take up another example. You can define an array named inventory of size 8, and initialize all the values with zero, as:

    INVENTORY   DW  0
                DW  0
                DW  0
                DW  0
                DW  0
                DW  0
                DW  0
                DW  0

Which can be abbreviated as:

    INVENTORY   DW  0, 0, 0, 0, 0, 0, 0, 0

The TIMES directive can also be used for multiple initializations to the same value. Using TIMES, the INVENTORY array can be defined as:

    INVENTORY TIMES 8 DW 0

Example

The following example demonstrates the above concepts by defining a 3-element array x, which stores three values: 2, 3 and 4.
It adds the values in the array and displays the sum 9:

    section .text
       global _start        ;must be declared for linker (ld)

    _start:
       mov eax, 3           ;number of bytes to be summed
       mov ebx, 0           ;EBX will store the sum
       mov ecx, x           ;ECX will point to the current element to be summed

    top:
       add bl, [ecx]        ;add one byte; the elements are defined with db
       add ecx, 1           ;move pointer to next element
       dec eax              ;decrement counter
       jnz top              ;if counter not 0, then loop again

    done:
       add ebx, '0'         ;convert the sum to an ASCII digit
       mov [sum], bl        ;done, store result in "sum" (one byte)

    display:
       mov edx, 1           ;message length
       mov ecx, sum         ;message to write
       mov ebx, 1           ;file descriptor (stdout)
       mov eax, 4           ;system call number (sys_write)
       int 0x80             ;call kernel

       mov eax, 1           ;system call number (sys_exit)
       int 0x80             ;call kernel

    section .data
    global x
    x:
       db 2
       db 3
       db 4

    sum:
       db 0

When the above code is assembled and executed, it produces the following result:

    9
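For arrays of words rather than bytes, each element sits two bytes after the previous one, as noted for NUMBERS above. The sketch below, for 32-bit Linux, walks a word array with a scaled index instead of a moving pointer. The array contents, the label names and the idea of returning the sum as the exit status are all choices made for this illustration.

```nasm
; Sketch: 32-bit Linux, NASM syntax. numbers and COUNT are hypothetical names.
section .data
numbers dw 10, 20, 30, 40          ; four 2-byte elements
COUNT   equ ($ - numbers) / 2      ; bytes used divided by the element size

section .text
    global _start

_start:
    xor   eax, eax                 ; running sum
    xor   ecx, ecx                 ; element index, starts at 0

next:
    movzx edx, word [numbers + ecx*2]   ; load element ECX, zero-extended
    add   eax, edx
    inc   ecx
    cmp   ecx, COUNT
    jb    next                     ; repeat while index < COUNT

    mov   ebx, eax                 ; exit status = sum (100 for these values)
    mov   eax, 1                   ; system call number (sys_exit)
    int   0x80
```

Computing COUNT from $ keeps the loop bound correct if elements are added to the list, provided the equ line stays directly after the array definition.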

Assembly – Basic Syntax

An assembly program can be divided into three sections:

- The data section,
- The bss section, and
- The text section.

The data Section

The data section is used for declaring initialized data or constants. The size of this data does not change at runtime. You can declare various constant values, file names, buffer sizes, etc., in this section.

The syntax for declaring the data section is:

    section .data

The bss Section

The bss section is used for declaring variables. The syntax for declaring the bss section is:

    section .bss

The text Section

The text section is used for keeping the actual code. This section must begin with the declaration global _start, which tells the linker where the program execution begins.

The syntax for declaring the text section is:

    section .text
       global _start
    _start:

Comments

An assembly language comment begins with a semicolon (;). It may contain any printable character, including blanks. It can appear on a line by itself, like:

    ; This program displays a message on screen

or on the same line along with an instruction, like:

    add eax, ebx     ; adds ebx to eax

Assembly Language Statements

Assembly language programs consist of three types of statements:

- Executable instructions or instructions,
- Assembler directives or pseudo-ops, and
- Macros.

The executable instructions, or simply instructions, tell the processor what to do. Each instruction consists of an operation code (opcode). Each executable instruction generates one machine language instruction.

The assembler directives, or pseudo-ops, tell the assembler about the various aspects of the assembly process. These are non-executable and do not generate machine language instructions.

Macros are basically a text substitution mechanism.

Syntax of Assembly Language Statements

Assembly language statements are entered one statement per line. Each statement has the following format:

    [label]   mnemonic   [operands]   [;comment]

The fields in the square brackets are optional.
A basic instruction has two parts: the first is the name of the instruction (the mnemonic) to be executed, and the second is the operands or parameters of the command.

Following are some examples of typical assembly language statements:

    INC COUNT        ; Increment the memory variable COUNT

    MOV TOTAL, 48    ; Transfer the value 48 in the
                     ; memory variable TOTAL

    ADD AH, BH       ; Add the content of the
                     ; BH register into the AH register

    AND MASK1, 128   ; Perform AND operation on the
                     ; variable MASK1 and 128

    ADD MARKS, 10    ; Add 10 to the variable MARKS
    MOV AL, 10       ; Transfer the value 10 to the AL register

The Hello World Program in Assembly

The following assembly language code displays the string 'Hello, world!' on the screen:

    section .text
       global _start        ;must be declared for linker (ld)

    _start:                 ;tells linker entry point
       mov edx, len         ;message length
       mov ecx, msg         ;message to write
       mov ebx, 1           ;file descriptor (stdout)
       mov eax, 4           ;system call number (sys_write)
       int 0x80             ;call kernel

       mov eax, 1           ;system call number (sys_exit)
       int 0x80             ;call kernel

    section .data
    msg db 'Hello, world!', 0xa  ;string to be printed
    len equ $ - msg              ;length of the string

When the above code is assembled and executed, it produces the following result:

    Hello, world!

Compiling and Linking an Assembly Program in NASM

Make sure the directories containing the nasm and ld binaries are in your PATH environment variable. Then take the following steps for assembling and linking the above program:

- Type the above code using a text editor and save it as hello.asm.
- Make sure that you are in the same directory as where you saved hello.asm.
- To assemble the program, type nasm -f elf hello.asm
- If there is any error, it is reported at this stage. Otherwise, an object file of your program named hello.o is created.
- To link the object file and create an executable file named hello, type ld -m elf_i386 -s -o hello hello.o
- Execute the program by typing ./hello

If you have done everything correctly, it will display 'Hello, world!' on the screen.
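The Hello World program uses only the data and text sections. As a rough sketch of how all three sections described in this chapter fit together, the following 32-bit Linux program copies a string from the data section into a buffer reserved in the bss section and prints the copy. The names greeting, glen, buffer and the label copy are invented for this example.

```nasm
; Sketch: 32-bit Linux, NASM syntax. All symbol names are hypothetical.
section .data                   ; initialized data
greeting db 'Hi', 0xA           ; two characters and a newline
glen     equ $ - greeting       ; length of the string (3)

section .bss                    ; storage that is reserved but not initialized
buffer   resb glen

section .text                   ; code
    global _start               ; must be declared for linker (ld)

_start:
    mov   esi, greeting         ; source address
    mov   edi, buffer           ; destination address
    mov   ecx, glen             ; number of bytes to copy
copy:
    mov   al, [esi]             ; copy one byte through AL
    mov   [edi], al
    inc   esi
    inc   edi
    loop  copy

    mov   eax, 4                ; system call number (sys_write)
    mov   ebx, 1                ; file descriptor (stdout)
    mov   ecx, buffer           ; print the copy, not the original
    mov   edx, glen
    int   0x80

    mov   eax, 1                ; system call number (sys_exit)
    xor   ebx, ebx              ; exit status 0
    int   0x80
```

Assuming the file is saved as sections.asm, it can be built with the same commands as above, substituting the file name: nasm -f elf sections.asm followed by ld -m elf_i386 -s -o sections sections.o.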

Assembly – Introduction

What is Assembly Language?

Each personal computer has a microprocessor that manages the computer's arithmetical, logical, and control activities.

Each family of processors has its own set of instructions for handling various operations such as getting input from the keyboard, displaying information on the screen and performing various other jobs. This set of instructions is called 'machine language instructions'.

A processor understands only machine language instructions, which are strings of 1's and 0's. However, machine language is too obscure and complex for use in software development. So, the low-level assembly language is designed for a specific family of processors and represents the various instructions in symbolic code and a more understandable form.

Advantages of Assembly Language

Having an understanding of assembly language makes one aware of:

- How programs interface with the OS, processor, and BIOS;
- How data is represented in memory and other external devices;
- How the processor accesses and executes instructions;
- How instructions access and process data;
- How a program accesses external devices.

Other advantages of using assembly language are:

- It requires less memory and execution time;
- It allows hardware-specific complex jobs in an easier way;
- It is suitable for time-critical jobs;
- It is most suitable for writing interrupt service routines and other memory-resident programs.

Basic Features of PC Hardware

The main internal hardware of a PC consists of processor, memory, and registers. Registers are processor components that hold data and addresses. To execute a program, the system copies it from the external device into the internal memory. The processor executes the program instructions.

The fundamental unit of computer storage is a bit; it can be ON (1) or OFF (0). A group of 8 related bits makes a byte on most modern computers.

A parity bit is used to make the number of 1 bits in a byte odd.
If the parity is even, the system assumes that there has been a parity error (though rare), which might have been caused by a hardware fault or electrical disturbance.

The processor supports the following data sizes:

- Word: a 2-byte data item
- Doubleword: a 4-byte (32-bit) data item
- Quadword: an 8-byte (64-bit) data item
- Paragraph: a 16-byte (128-bit) area
- Kilobyte: 1024 bytes
- Megabyte: 1,048,576 bytes

Binary Number System

Every number system uses positional notation, i.e., each position in which a digit is written has a different positional value. Each position is a power of the base, which is 2 for the binary number system, and these powers begin at 0 and increase by 1.

The following table shows the positional values for an 8-bit binary number, where all bits are set ON.

    Bit value                              1    1   1   1   1  1  1  1
    Position value as a power of base 2    128  64  32  16  8  4  2  1
    Bit number                             7    6   5   4   3  2  1  0

The value of a binary number is based on the presence of 1 bits and their positional value. So, the value of the given binary number is:

    1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 255

which is the same as 2^8 - 1.

Hexadecimal Number System

The hexadecimal number system uses base 16. The digits in this system range from 0 to 15. By convention, the letters A through F are used to represent the hexadecimal digits corresponding to decimal values 10 through 15.

Hexadecimal numbers are used in computing for abbreviating lengthy binary representations. Basically, the hexadecimal number system represents binary data by dividing each byte in half and expressing the value of each half-byte.
The following table provides the decimal, binary, and hexadecimal equivalents:

    Decimal number   Binary representation   Hexadecimal representation
    0                0                       0
    1                1                       1
    2                10                      2
    3                11                      3
    4                100                     4
    5                101                     5
    6                110                     6
    7                111                     7
    8                1000                    8
    9                1001                    9
    10               1010                    A
    11               1011                    B
    12               1100                    C
    13               1101                    D
    14               1110                    E
    15               1111                    F

To convert a binary number to its hexadecimal equivalent, break it into groups of 4 consecutive bits each, starting from the right, and write the hexadecimal digit that corresponds to each group.

Example: Binary number 1000 1100 1101 0001 is equivalent to hexadecimal 8CD1

To convert a hexadecimal number to binary, just write each hexadecimal digit as its 4-digit binary equivalent.

Example: Hexadecimal number FAD8 is equivalent to binary 1111 1010 1101 1000

Binary Arithmetic

The following table illustrates four simple rules for binary addition:

    (i)    (ii)   (iii)   (iv)
                           1
     0      1      1       1
    +0     +0     +1      +1
    =0     =1     =10     =11

Rules (iii) and (iv) show a carry of a 1-bit into the next left position.

Example

    Decimal   Binary
      60      00111100
     +42      00101010
     102      01100110

A negative binary value is expressed in two's complement notation. According to this rule, a binary number is converted to its negative value by reversing its bit values and adding 1.

Example

    Number 53          00110101
    Reverse the bits   11001010
    Add 1              00000001
    Number -53         11001011

To subtract one value from another, convert the number being subtracted to two's complement format and add the numbers.

Example

Subtract 42 from 53

    Number 53                  00110101
    Number 42                  00101010
    Reverse the bits of 42     11010101
    Add 1                      00000001
    Number -42                 11010110
    53 - 42 = 11               00001011

The carry out of the leftmost bit is discarded.

Addressing Data in Memory

The process through which the processor controls the execution of instructions is referred to as the fetch-decode-execute cycle or the execution cycle.
It consists of three continuous steps:

- Fetching the instruction from memory
- Decoding or identifying the instruction
- Executing the instruction

The processor may access one or more bytes of memory at a time. Let us consider a hexadecimal number 0725H. This number requires two bytes of memory. The high-order byte, or most significant byte, is 07 and the low-order byte is 25.

The processor stores data in reverse-byte sequence,

Assembly – File Management

The system considers any input or output data as a stream of bytes. There are three standard file streams:

- Standard input (stdin),
- Standard output (stdout), and
- Standard error (stderr).

File Descriptor

A file descriptor is a 16-bit integer assigned to a file as a file id. When a new file is created or an existing file is opened, the file descriptor is used for accessing the file.

The file descriptors of the standard file streams stdin, stdout and stderr are 0, 1 and 2, respectively.

File Pointer

A file pointer specifies the location for a subsequent read/write operation in the file in terms of bytes. Each file is considered as a sequence of bytes. Each open file is associated with a file pointer that specifies an offset in bytes, relative to the beginning of the file. When a file is opened, the file pointer is set to zero.

File Handling System Calls

The following table briefly describes the system calls related to file handling:

    %eax   Name        %ebx             %ecx           %edx
    2      sys_fork    struct pt_regs   -              -
    3      sys_read    unsigned int     char *         size_t
    4      sys_write   unsigned int     const char *   size_t
    5      sys_open    const char *     int            int
    6      sys_close   unsigned int     -              -
    8      sys_creat   const char *     int            -
    19     sys_lseek   unsigned int     off_t          unsigned int

The steps required for using the system calls are the same as discussed earlier:

- Put the system call number in the EAX register.
- Store the arguments to the system call in the registers EBX, ECX, etc.
- Call the relevant interrupt (80h).
- The result is usually returned in the EAX register.

Creating and Opening a File

For creating and opening a file, perform the following tasks:

- Put the system call sys_creat() number 8 in the EAX register.
- Put the filename in the EBX register.
- Put the file permissions in the ECX register.

The system call returns the file descriptor of the created file in the EAX register. In case of error, the error code is in the EAX register.
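The creation steps above can be sketched as a small 32-bit Linux program that also checks the result. Two points here are assumptions beyond the text: the Linux kernel reports an error as a negative value in EAX, which is why the sign flag is tested, and NASM needs an explicit octal marker (such as the 0o prefix) for permission bits, since a bare leading zero does not make a number octal. The file name, the permission value 0o644 and the label names are chosen only for illustration.

```nasm
; Sketch: 32-bit Linux, NASM syntax. Names and permission bits are hypothetical.
section .data
file_name db 'myfile.txt', 0    ; the path must be NUL-terminated

section .text
    global _start

_start:
    mov   eax, 8                ; system call number (sys_creat)
    mov   ebx, file_name        ; pointer to the file name
    mov   ecx, 0o644            ; permissions in octal: rw-r--r--
    int   0x80                  ; call kernel

    test  eax, eax              ; a negative result indicates an error
    js    failed

    mov   ebx, eax              ; file descriptor returned by sys_creat
    mov   eax, 6                ; system call number (sys_close)
    int   0x80

    mov   eax, 1                ; system call number (sys_exit)
    xor   ebx, ebx              ; exit status 0: file created
    int   0x80

failed:
    mov   eax, 1                ; system call number (sys_exit)
    mov   ebx, 1                ; exit status 1: creation failed
    int   0x80
```

Using the exit status to signal failure keeps the sketch short; a fuller program would more likely print a message before exiting.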
Opening an Existing File

For opening an existing file, perform the following tasks:

- Put the system call number 5 (sys_open) in the EAX register.
- Put the address of the NUL-terminated filename in the EBX register.
- Put the file access mode in the ECX register.
- Put the file permissions in the EDX register (these apply only if the call creates the file).

The system call returns the file descriptor of the opened file in the EAX register. In case of error, EAX holds a negative error code instead.

Among the file access modes, the most commonly used are read-only (0), write-only (1), and read-write (2).

Reading from a File

For reading from a file, perform the following tasks:

- Put the system call number 3 (sys_read) in the EAX register.
- Put the file descriptor in the EBX register.
- Put the pointer to the input buffer in the ECX register.
- Put the buffer size, i.e., the number of bytes to read, in the EDX register.

The system call returns the number of bytes read in the EAX register. In case of error, EAX holds a negative error code instead.

Writing to a File

For writing to a file, perform the following tasks:

- Put the system call number 4 (sys_write) in the EAX register.
- Put the file descriptor in the EBX register.
- Put the pointer to the output buffer in the ECX register.
- Put the buffer size, i.e., the number of bytes to write, in the EDX register.

The system call returns the actual number of bytes written in the EAX register. In case of error, EAX holds a negative error code instead.

Closing a File

For closing a file, perform the following tasks:

- Put the system call number 6 (sys_close) in the EAX register.
- Put the file descriptor in the EBX register.

The system call returns 0 in the EAX register on success and a negative error code in case of error.

Updating a File

To update a file at a particular position, first move its file pointer by performing the following tasks:

- Put the system call number 19 (sys_lseek) in the EAX register.
- Put the file descriptor in the EBX register.
- Put the offset value in the ECX register.
- Put the reference position for the offset in the EDX register.
The reference position could be:

- Beginning of file: value 0
- Current position: value 1
- End of file: value 2

On success, the system call returns the resulting offset in the EAX register. In case of error, EAX holds a negative error code instead.

Example

The following program creates and opens a file named myfile.txt, and writes the text 'Welcome to Tutorials Point' to this file. Next, the program reads from the file and stores the data in a buffer named info. Lastly, it displays the text as stored in info.

    section .text
        global _start           ;must be declared for the linker

    _start:                     ;tell linker entry point
        ;create the file
        mov eax, 8              ;system call number (sys_creat)
        mov ebx, file_name
        mov ecx, 0o777          ;read, write and execute by all (octal in NASM needs 0o)
        int 0x80                ;call kernel

        mov [fd_out], eax

        ;write into the file
        mov edx, len            ;number of bytes
        mov ecx, msg            ;message to write
        mov ebx, [fd_out]       ;file descriptor
        mov eax, 4              ;system call number (sys_write)
        int 0x80                ;call kernel

        ;close the file
        mov eax, 6              ;system call number (sys_close)
        mov ebx, [fd_out]
        int 0x80                ;call kernel (missing in the original listing)

        ;write the message indicating end of file write
        mov eax, 4
        mov ebx, 1
        mov ecx, msg_done
        mov edx, len_done
        int 0x80

        ;open the file for reading
        mov eax, 5              ;system call number (sys_open)
        mov ebx, file_name
        mov ecx, 0              ;for read only access
        mov edx, 0o777          ;read, write and execute by all
        int 0x80

        mov [fd_in], eax

        ;read from file
        mov eax, 3              ;system call number (sys_read)
        mov ebx, [fd_in]
        mov ecx, info
        mov edx, 26
        int 0x80

        ;close the file
        mov eax, 6
        mov ebx, [fd_in]
        int 0x80

        ;print the info
        mov eax, 4
        mov ebx, 1
        mov ecx, info
        mov edx, 26
        int 0x80

        mov eax, 1              ;system call number (sys_exit)
        int 0x80                ;call kernel

    section .data
    file_name db 'myfile.txt', 0    ;NUL-terminated, as sys_creat and sys_open expect
    msg db 'Welcome to Tutorials Point'
    len equ $-msg
    msg_done db

Assembly – System Calls

System calls are the interface between user space and kernel space. We have already used the system calls sys_write and sys_exit, for writing to the screen and exiting from the program, respectively.

Linux System Calls

You can make use of Linux system calls in your assembly programs. Take the following steps to use a Linux system call in your program:

- Put the system call number in the EAX register.
- Store the arguments to the system call in the registers EBX, ECX, etc.
- Call the relevant interrupt (80h).
- The result is usually returned in the EAX register.

Six registers store the arguments of the system call: EBX, ECX, EDX, ESI, EDI, and EBP. These registers take the consecutive arguments, starting with the EBX register. If there are more than six arguments, then the memory location of the first argument is stored in the EBX register.

The following code snippet shows the use of the system call sys_exit:

    mov ebx,0       ; exit status
    mov eax,1       ; system call number (sys_exit)
    int 0x80        ; call kernel

The following code snippet shows the use of the system call sys_write:

    mov edx,4       ; message length
    mov ecx,msg     ; message to write
    mov ebx,1       ; file descriptor (stdout)
    mov eax,4       ; system call number (sys_write)
    int 0x80        ; call kernel

All the syscalls are listed in /usr/include/asm/unistd.h, together with their numbers (the value to put in EAX before you call int 80h).
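The sys_write and sys_exit snippets can be combined into one complete program. The following is a minimal sketch, assuming 32-bit Linux and NASM syntax; the message text and the labels msg and len are illustrative. Instead of a hard-coded length, it lets the assembler compute the message length with equ.

```nasm
; Sketch: write a message to stdout, then exit with status 0.
; Assumes 32-bit Linux. One possible build:
;   nasm -f elf32 hello.asm && ld -m elf_i386 -o hello hello.o
section .data
    msg db 'Hello', 0xA             ; illustrative text plus newline
    len equ $ - msg                 ; length computed by the assembler

section .text
    global _start

_start:
    mov edx, len                    ; message length
    mov ecx, msg                    ; message to write
    mov ebx, 1                      ; file descriptor (stdout)
    mov eax, 4                      ; system call number (sys_write)
    int 0x80                        ; call kernel

    mov ebx, 0                      ; exit status
    mov eax, 1                      ; system call number (sys_exit)
    int 0x80                        ; call kernel
```

Each call follows the same pattern: the call number goes in EAX, the arguments go in EBX, ECX and EDX in order, and int 0x80 transfers control to the kernel.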
The following table shows some of the system calls used in this tutorial:

    %eax  Name       %ebx            %ecx          %edx    %esi  %edi
    1     sys_exit   int             -             -       -     -
    2     sys_fork   struct pt_regs  -             -       -     -
    3     sys_read   unsigned int    char *        size_t  -     -
    4     sys_write  unsigned int    const char *  size_t  -     -
    5     sys_open   const char *    int           int     -     -
    6     sys_close  unsigned int    -             -       -     -

Example

The following example reads a number from the keyboard and displays it on the screen:

    section .data                           ;Data segment
        userMsg db 'Please enter a number: ' ;Ask the user to enter a number
        lenUserMsg equ $-userMsg             ;The length of the message
        dispMsg db 'You have entered: '
        lenDispMsg equ $-dispMsg

    section .bss            ;Uninitialized data
        num resb 5

    section .text           ;Code segment
        global _start

    _start:
        ;User prompt
        mov eax, 4
        mov ebx, 1
        mov ecx, userMsg
        mov edx, lenUserMsg
        int 80h

        ;Read and store the user input
        mov eax, 3
        mov ebx, 0          ;file descriptor (stdin)
        mov ecx, num
        mov edx, 5          ;read up to 5 bytes
        int 80h

        ;Output the message 'You have entered: '
        mov eax, 4
        mov ebx, 1
        mov ecx, dispMsg
        mov edx, lenDispMsg
        int 80h

        ;Output the number entered
        mov eax, 4
        mov ebx, 1
        mov ecx, num
        mov edx, 5
        int 80h

        ;Exit code
        mov eax, 1
        mov ebx, 0
        int 80h

When the above code is compiled and executed, it produces the following result:

    Please enter a number: 1234
    You have entered: 1234
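The example always prints five bytes, regardless of how much was typed. One possible variation uses the byte count that sys_read returns in EAX as the length for sys_write. This is a minimal sketch, assuming 32-bit Linux and NASM syntax; the names buf, BUF_SIZE and done are hypothetical and not part of the tutorial.

```nasm
; Sketch: echo exactly the bytes that sys_read returned.
; Assumes 32-bit Linux. One possible build:
;   nasm -f elf32 echo.asm && ld -m elf_i386 -o echo echo.o
BUF_SIZE equ 16                     ; hypothetical buffer size

section .bss
    buf resb BUF_SIZE               ; input buffer

section .text
    global _start

_start:
    mov eax, 3                      ; system call number (sys_read)
    mov ebx, 0                      ; file descriptor (stdin)
    mov ecx, buf                    ; where to store the input
    mov edx, BUF_SIZE               ; maximum number of bytes to read
    int 0x80                        ; call kernel

    cmp eax, 0
    jle done                        ; 0 = end of input, negative = error

    mov edx, eax                    ; number of bytes actually read
    mov eax, 4                      ; system call number (sys_write)
    mov ebx, 1                      ; file descriptor (stdout)
    mov ecx, buf                    ; data to write
    int 0x80                        ; call kernel

done:
    mov eax, 1                      ; system call number (sys_exit)
    xor ebx, ebx                    ; exit status 0
    int 0x80                        ; call kernel
```

The return value is copied into EDX before EAX is reloaded with the sys_write number, since every system call overwrites EAX with its own result.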