Assembly – File Management

The system treats any input or output data as a stream of bytes. There are three standard file streams: standard input (stdin), standard output (stdout), and standard error (stderr).

File Descriptor

A file descriptor is a small non-negative integer assigned to a file as a file id. When a new file is created or an existing file is opened, the file descriptor is used for accessing the file.

The file descriptors of the standard file streams stdin, stdout, and stderr are 0, 1, and 2, respectively.

File Pointer

A file pointer specifies the location, in bytes, of the next read or write operation in the file. Each file is considered a sequence of bytes. Each open file is associated with a file pointer that specifies an offset in bytes, relative to the beginning of the file. When a file is opened, the file pointer is set to zero.

File Handling System Calls

The following table briefly describes the system calls related to file handling:

%eax  Name       %ebx            %ecx          %edx
2     sys_fork   struct pt_regs  -             -
3     sys_read   unsigned int    char *        size_t
4     sys_write  unsigned int    const char *  size_t
5     sys_open   const char *    int           int
6     sys_close  unsigned int    -             -
8     sys_creat  const char *    int           -
19    sys_lseek  unsigned int    off_t         unsigned int

The steps required for using the system calls are the same as discussed earlier:

- Put the system call number in the EAX register.
- Store the arguments to the system call in the registers EBX, ECX, etc.
- Call the relevant interrupt (80h).
- The result is usually returned in the EAX register.

Creating and Opening a File

For creating and opening a file, perform the following tasks:

- Put the number of the sys_creat() system call, 8, in the EAX register.
- Put the address of the zero-terminated filename in the EBX register.
- Put the file permissions in the ECX register.

The system call returns the file descriptor of the created file in the EAX register. In case of error, EAX holds a negative error code instead.
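The steps for creating a file can be sketched as a short NASM program for 32-bit Linux. The file name demo.txt, the labels, and the choice to exit with status 1 on failure are assumptions made for illustration; the check for a negative value in EAX follows the error convention described above.

```nasm
; Sketch only: 32-bit Linux, int 0x80 interface.
; demo.txt and all labels are hypothetical names.
section .text
    global _start

_start:
    mov eax, 8              ; system call number (sys_creat)
    mov ebx, file_name      ; address of the zero-terminated file name
    mov ecx, 0o644          ; permissions in octal: rw-r--r--
    int 0x80                ; call kernel
    test eax, eax           ; set the sign flag from the result
    js  failed              ; negative result: EAX holds an error code
    mov [fd], eax           ; otherwise EAX is the new file descriptor

    mov eax, 6              ; system call number (sys_close)
    mov ebx, [fd]           ; file descriptor to close
    int 0x80

    mov ebx, 0              ; exit status 0: success
    jmp quit

failed:
    mov ebx, 1              ; exit status 1: the file could not be created

quit:
    mov eax, 1              ; system call number (sys_exit)
    int 0x80

section .data
file_name db 'demo.txt', 0  ; the kernel expects a zero-terminated string

section .bss
fd  resd 1                  ; storage for the file descriptor
```

Testing the sign of EAX right after int 0x80 keeps the error path separate from the normal path, so a failed call is never mistaken for a valid file descriptor.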
Opening an Existing File

For opening an existing file, perform the following tasks:

- Put the number of the sys_open() system call, 5, in the EAX register.
- Put the address of the zero-terminated filename in the EBX register.
- Put the file access mode in the ECX register.
- Put the file permissions in the EDX register.

The system call returns the file descriptor of the opened file in the EAX register. In case of error, EAX holds a negative error code instead.

Among the file access modes, the most commonly used are read-only (0), write-only (1), and read-write (2).

Reading from a File

For reading from a file, perform the following tasks:

- Put the number of the sys_read() system call, 3, in the EAX register.
- Put the file descriptor in the EBX register.
- Put the pointer to the input buffer in the ECX register.
- Put the buffer size, i.e., the number of bytes to read, in the EDX register.

The system call returns the number of bytes read in the EAX register. In case of error, EAX holds a negative error code instead.

Writing to a File

For writing to a file, perform the following tasks:

- Put the number of the sys_write() system call, 4, in the EAX register.
- Put the file descriptor in the EBX register.
- Put the pointer to the output buffer in the ECX register.
- Put the buffer size, i.e., the number of bytes to write, in the EDX register.

The system call returns the actual number of bytes written in the EAX register. In case of error, EAX holds a negative error code instead.

Closing a File

For closing a file, perform the following tasks:

- Put the number of the sys_close() system call, 6, in the EAX register.
- Put the file descriptor in the EBX register.

The system call returns 0 in the EAX register on success and a negative error code in case of error.

Updating a File

For updating a file, perform the following tasks:

- Put the number of the sys_lseek() system call, 19, in the EAX register.
- Put the file descriptor in the EBX register.
- Put the offset value in the ECX register.
- Put the reference position for the offset in the EDX register.
The reference position could be:

- Beginning of file: value 0
- Current position: value 1
- End of file: value 2

On success, the system call returns the resulting file pointer position in the EAX register. In case of error, EAX holds a negative error code instead.

Example

The following program creates and opens a file named myfile.txt, and writes the text 'Welcome to Tutorials Point' in this file. Next, the program reads from the file and stores the data into a buffer named info. Lastly, it displays the text as stored in info.

    section .text
       global _start         ;must be declared for using gcc

    _start:                  ;tell linker entry point
       ;create the file
       mov  eax, 8
       mov  ebx, file_name
       mov  ecx, 0o777       ;read, write and execute by all (octal; a plain 0777 is decimal in NASM)
       int  0x80             ;call kernel

       mov  [fd_out], eax

       ;write into the file
       mov  edx, len         ;number of bytes
       mov  ecx, msg         ;message to write
       mov  ebx, [fd_out]    ;file descriptor
       mov  eax, 4           ;system call number (sys_write)
       int  0x80             ;call kernel

       ;close the file
       mov  eax, 6
       mov  ebx, [fd_out]
       int  0x80             ;call kernel (missing in the original listing)

       ;write the message indicating end of file write
       mov  eax, 4
       mov  ebx, 1
       mov  ecx, msg_done
       mov  edx, len_done
       int  0x80

       ;open the file for reading
       mov  eax, 5
       mov  ebx, file_name
       mov  ecx, 0           ;for read only access
       mov  edx, 0o777       ;read, write and execute by all
       int  0x80

       mov  [fd_in], eax

       ;read from file
       mov  eax, 3
       mov  ebx, [fd_in]
       mov  ecx, info
       mov  edx, 26
       int  0x80

       ;close the file
       mov  eax, 6
       mov  ebx, [fd_in]
       int  0x80

       ;print the info
       mov  eax, 4
       mov  ebx, 1
       mov  ecx, info
       mov  edx, 26
       int  0x80

       mov  eax, 1           ;system call number (sys_exit)
       int  0x80             ;call kernel

    section .data
    file_name db 'myfile.txt', 0   ;zero-terminated, as the kernel expects
    msg db 'Welcome to Tutorials Point'
    len equ $-msg

    msg_done db
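The sys_lseek steps described under Updating a File have no example of their own, so here is a rough sketch. It assumes that myfile.txt already exists, opens it write-only, moves the file pointer to the end of the file, and appends a few bytes. The labels fd, extra, and extra_len, and the appended text, are made up for illustration.

```nasm
; Sketch only: 32-bit Linux, int 0x80 interface.
; Appends text to an existing file by seeking to its end.
section .text
    global _start

_start:
    mov eax, 5              ; system call number (sys_open)
    mov ebx, file_name      ; address of the zero-terminated file name
    mov ecx, 1              ; access mode: write-only
    mov edx, 0              ; permissions, not needed for an existing file
    int 0x80
    test eax, eax
    js  failed              ; negative result: EAX holds an error code
    mov [fd], eax           ; save the file descriptor

    mov eax, 19             ; system call number (sys_lseek)
    mov ebx, [fd]           ; file descriptor
    mov ecx, 0              ; offset of 0 bytes ...
    mov edx, 2              ; ... relative to the end of the file
    int 0x80

    mov eax, 4              ; system call number (sys_write)
    mov ebx, [fd]
    mov ecx, extra          ; bytes to append
    mov edx, extra_len
    int 0x80

    mov eax, 6              ; system call number (sys_close)
    mov ebx, [fd]
    int 0x80

    mov ebx, 0              ; exit status 0: success
    jmp quit

failed:
    mov ebx, 1              ; exit status 1: the file could not be opened

quit:
    mov eax, 1              ; system call number (sys_exit)
    int 0x80

section .data
file_name db 'myfile.txt', 0
extra     db ' (appended)'
extra_len equ $ - extra

section .bss
fd  resd 1                  ; storage for the file descriptor
```

Changing the value in EDX to 0 or 1 makes the same offset relative to the beginning of the file or to the current position instead.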

Assembly – System Calls

System calls are APIs for the interface between user space and kernel space. We have already used the system calls sys_write and sys_exit, for writing to the screen and exiting from the program, respectively.

Linux System Calls

You can make use of Linux system calls in your assembly programs. You need to take the following steps for using Linux system calls in your program:

- Put the system call number in the EAX register.
- Store the arguments to the system call in the registers EBX, ECX, etc.
- Call the relevant interrupt (80h).
- The result is usually returned in the EAX register.

There are six registers that store the arguments of the system call used. These are EBX, ECX, EDX, ESI, EDI, and EBP. These registers take the consecutive arguments, starting with the EBX register. If there are more than six arguments, then the memory location of the first argument is stored in the EBX register.

The following code snippet shows the use of the system call sys_exit:

    mov  eax, 1      ; system call number (sys_exit)
    int  0x80        ; call kernel

The following code snippet shows the use of the system call sys_write:

    mov  edx, 4      ; message length
    mov  ecx, msg    ; message to write
    mov  ebx, 1      ; file descriptor (stdout)
    mov  eax, 4      ; system call number (sys_write)
    int  0x80        ; call kernel

All the syscalls are listed in /usr/include/asm/unistd.h, together with their numbers (the value to put in EAX before you call int 80h).
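Because the numbers in that header are easy to mistype, one option is to give them names with equ. The following is a minimal sketch; the constant names SYS_EXIT, SYS_WRITE, and STDOUT and the message text are chosen here for illustration and are not defined by NASM.

```nasm
; Sketch only: symbolic names for the values used so far.
SYS_EXIT  equ 1             ; system call number of sys_exit
SYS_WRITE equ 4             ; system call number of sys_write
STDOUT    equ 1             ; file descriptor of standard output

section .text
    global _start

_start:
    mov eax, SYS_WRITE      ; system call number goes in EAX
    mov ebx, STDOUT         ; first argument: file descriptor
    mov ecx, msg            ; second argument: address of the message
    mov edx, len            ; third argument: length in bytes
    int 0x80                ; call kernel; EAX receives the result

    mov eax, SYS_EXIT
    mov ebx, 0              ; first argument: exit status
    int 0x80

section .data
msg db 'system call demo', 0xa
len equ $ - msg
```

The equ definitions produce no bytes in the program; the assembler substitutes the numbers, so the generated code is identical to the version with literal values.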
The following table shows some of the system calls used in this tutorial:

%eax  Name       %ebx            %ecx          %edx    %esi  %edi
1     sys_exit   int             -             -       -     -
2     sys_fork   struct pt_regs  -             -       -     -
3     sys_read   unsigned int    char *        size_t  -     -
4     sys_write  unsigned int    const char *  size_t  -     -
5     sys_open   const char *    int           int     -     -
6     sys_close  unsigned int    -             -       -     -

Example

The following example reads a number from the keyboard and displays it on the screen:

    section .data                           ;Data segment
       userMsg db 'Please enter a number: ' ;Ask the user to enter a number
       lenUserMsg equ $-userMsg             ;The length of the message
       dispMsg db 'You have entered: '
       lenDispMsg equ $-dispMsg

    section .bss           ;Uninitialized data
       num resb 5

    section .text          ;Code segment
       global _start

    _start:
       ;User prompt
       mov eax, 4
       mov ebx, 1
       mov ecx, userMsg
       mov edx, lenUserMsg
       int 80h

       ;Read and store the user input
       mov eax, 3
       mov ebx, 0          ;file descriptor (stdin)
       mov ecx, num
       mov edx, 5          ;read up to 5 bytes (digits, including 1 for a sign)
       int 80h

       ;Output the message 'You have entered: '
       mov eax, 4
       mov ebx, 1
       mov ecx, dispMsg
       mov edx, lenDispMsg
       int 80h

       ;Output the number entered
       mov eax, 4
       mov ebx, 1
       mov ecx, num
       mov edx, 5
       int 80h

       ;Exit code
       mov eax, 1
       mov ebx, 0
       int 80h

When the above code is compiled and executed, it produces the following result:

    Please enter a number: 1234
    You have entered: 1234
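The example above always writes 5 bytes, even when fewer were typed. A possible refinement, sketched here, relies on sys_read returning the number of bytes actually read in EAX and passes that count on to sys_write. The buffer name buf and its size of 16 bytes are illustrative choices.

```nasm
; Sketch only: echo exactly as many bytes as were read.
section .bss
    buf resb 16             ; hypothetical input buffer

section .text
    global _start

_start:
    mov eax, 3              ; system call number (sys_read)
    mov ebx, 0              ; file descriptor (stdin)
    mov ecx, buf            ; where to store the input
    mov edx, 16             ; read at most 16 bytes
    int 0x80
    test eax, eax
    jle done                ; 0 means end of input, negative means error

    mov edx, eax            ; the byte count becomes the length to write
    mov eax, 4              ; system call number (sys_write)
    mov ebx, 1              ; file descriptor (stdout)
    mov ecx, buf
    int 0x80

done:
    mov eax, 1              ; system call number (sys_exit)
    mov ebx, 0
    int 0x80
```

Copying EAX into EDX before loading the next system call number matters, because EAX is overwritten as soon as the call number for sys_write is stored in it.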

Assembly – Environment Setup

Local Environment Setup

Assembly language is dependent upon the instruction set and the architecture of the processor. In this tutorial, we focus on 32-bit Intel (IA-32) processors such as the Pentium. To follow this tutorial, you will need:

- An IBM PC or any equivalent compatible computer
- A copy of the Linux operating system
- A copy of the NASM assembler program

There are many good assembler programs, such as:

- Microsoft Assembler (MASM)
- Borland Turbo Assembler (TASM)
- The GNU assembler (GAS)

We will use the NASM assembler, as it is:

- Free. You can download it from various web sources.
- Well documented, with plenty of information available on the net.
- Usable on both Linux and Windows.

Installing NASM

If you select "Development Tools" while installing Linux, you may get NASM installed along with the Linux operating system, and you do not need to download and install it separately. To check whether you already have NASM installed, take the following steps:

- Open a Linux terminal.
- Type whereis nasm and press ENTER.
- If it is already installed, a line like nasm: /usr/bin/nasm appears. Otherwise, you will see just nasm:, and you need to install NASM.

To install NASM, take the following steps:

- Check the Netwide Assembler (NASM) website for the latest version.
- Download the Linux source archive nasm-X.XX.tar.gz, where X.XX is the NASM version number in the archive.
- Unpack the archive into a directory, which creates a subdirectory nasm-X.XX.
- cd to nasm-X.XX and type ./configure. This shell script will find the best C compiler to use and set up Makefiles accordingly.
- Type make to build the nasm and ndisasm binaries.
- Type make install to install nasm and ndisasm in /usr/local/bin and to install the man pages.

This should install NASM on your system. Alternatively, you can use an RPM distribution for Fedora Linux. This version is simpler to install: just double-click the RPM file.
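Once NASM is installed, a quick way to check the toolchain is to assemble and link a program that does nothing but exit. The sketch below assumes a file named check.asm (a hypothetical name) and the GNU linker ld. On a 64-bit system, the elf32 output format and the elf_i386 emulation shown in the comments are typically needed to build 32-bit code.

```nasm
; check.asm (hypothetical file name)
; Assemble:  nasm -f elf32 check.asm -o check.o
; Link:      ld -m elf_i386 check.o -o check
; Run:       ./check ; echo $?
section .text
    global _start

_start:
    mov eax, 1              ; system call number (sys_exit)
    mov ebx, 0              ; exit status reported to the shell
    int 0x80                ; call kernel
```

If all three commands run without errors and the shell reports an exit status of 0, the assembler and linker are working.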

Assembly – Memory Segments

We have already discussed the three sections of an assembly program. These sections represent various memory segments as well.

Interestingly, if you replace the section keyword with segment, you will get the same result. Try the following code:

    segment .text          ;code segment
       global _start       ;must be declared for linker

    _start:                ;tell linker entry point
       mov edx, len        ;message length
       mov ecx, msg        ;message to write
       mov ebx, 1          ;file descriptor (stdout)
       mov eax, 4          ;system call number (sys_write)
       int 0x80            ;call kernel

       mov eax, 1          ;system call number (sys_exit)
       int 0x80            ;call kernel

    segment .data          ;data segment
    msg db 'Hello, world!', 0xa   ;our dear string
    len equ $ - msg               ;length of our dear string

When the above code is compiled and executed, it produces the following result:

    Hello, world!

Memory Segments

A segmented memory model divides the system memory into groups of independent segments referenced by pointers located in the segment registers. Each segment is used to contain a specific type of data. One segment is used to contain instruction codes, another segment stores the data elements, and a third segment keeps the program stack.

In the light of the above discussion, we can specify various memory segments as:

- Data segment: It is represented by the .data section and the .bss section. The .data section is used to declare the memory region where data elements are stored for the program. This section cannot be expanded after the data elements are declared, and it remains static throughout the program. The .bss section is also a static memory section that contains buffers for data to be declared later in the program. This buffer memory is zero-filled.
- Code segment: It is represented by the .text section. This defines an area in memory that stores the instruction codes. This is also a fixed area.
- Stack: This segment contains data values passed to functions and procedures within the program.
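To see the three segments working together, consider the following sketch. It loads a byte from .data, passes it through the stack, stores it in a .bss buffer, and prints the buffer. The labels letter and copy are made up for this illustration.

```nasm
; Sketch only: one value travels through .data, the stack, and .bss.
section .data                ; initialized data
letter db 'A'

section .bss                 ; zero-filled buffer
copy resb 2

section .text                ; instruction codes
    global _start

_start:
    movzx eax, byte [letter] ; load the byte from .data, zero-extended
    push eax                 ; place it on the stack
    pop  ebx                 ; take it back off the stack
    mov [copy], bl           ; store it in the .bss buffer
    mov byte [copy + 1], 0xa ; follow it with a newline

    mov eax, 4               ; system call number (sys_write)
    mov ebx, 1               ; file descriptor (stdout)
    mov ecx, copy            ; address of the buffer
    mov edx, 2               ; two bytes: the letter and the newline
    int 0x80

    mov eax, 1               ; system call number (sys_exit)
    mov ebx, 0
    int 0x80
```

The program prints the letter A followed by a newline. Only the .data byte takes up space for its value in the executable file; the .bss buffer is merely reserved and filled with zeros when the program is loaded.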

Assembly – Logical Instructions

The processor instruction set provides the Boolean logic instructions AND, OR, XOR, TEST, and NOT, which test, set, and clear bits according to the needs of the program.

The format for these instructions:

Sr.No.  Instruction  Format
1       AND          AND operand1, operand2
2       OR           OR operand1, operand2
3       XOR          XOR operand1, operand2
4       TEST         TEST operand1, operand2
5       NOT          NOT operand1

The first operand in all the cases could be either in a register or in memory. The second operand could be either in a register or memory, or an immediate (constant) value. However, memory-to-memory operations are not possible. Except for NOT, which leaves the flags unchanged, these instructions compare or match bits of the operands and set the CF, OF, PF, SF and ZF flags (CF and OF are always cleared).

The AND Instruction

The AND instruction is used for supporting logical expressions by performing a bitwise AND operation. The bitwise AND operation returns 1 if the matching bits from both the operands are 1; otherwise it returns 0. For example:

                 Operand1:    0101
                 Operand2:    0011
    ------------------------------
    After AND -> Operand1:    0001

The AND operation can be used for clearing one or more bits. For example, say the BL register contains 0011 1010. If you need to clear the four high-order bits to zero, you AND it with 0FH.

    AND BL, 0FH   ; This sets BL to 0000 1010

Let's take up another example. If you want to check whether a given number is odd or even, a simple test would be to check the least significant bit of the number. If this is 1, the number is odd, else the number is even.
Assuming the number is in the AL register, we can write:

    AND AL, 01H   ; ANDing with 0000 0001
    JZ  EVEN_NUMBER

The following program illustrates this:

Example

    section .text
       global _start          ;must be declared for using gcc

    _start:                   ;tell linker entry point
       mov ax, 8h             ;getting 8 in the ax
       and ax, 1              ;and ax with 1
       jz  evnn
       mov eax, 4             ;system call number (sys_write)
       mov ebx, 1             ;file descriptor (stdout)
       mov ecx, odd_msg       ;message to write
       mov edx, len2          ;length of message
       int 0x80               ;call kernel
       jmp outprog

    evnn:
       mov eax, 4             ;system call number (sys_write)
       mov ebx, 1             ;file descriptor (stdout)
       mov ecx, even_msg      ;message to write
       mov edx, len1          ;length of message
       int 0x80               ;call kernel

    outprog:
       mov eax, 1             ;system call number (sys_exit)
       int 0x80               ;call kernel

    section .data
    even_msg db 'Even Number!'   ;message showing even number
    len1 equ $ - even_msg

    odd_msg db 'Odd Number!'     ;message showing odd number
    len2 equ $ - odd_msg

When the above code is compiled and executed, it produces the following result:

    Even Number!

Change the value in the ax register to an odd digit, like:

    mov ax, 9h   ; getting 9 in the ax

The program would display:

    Odd Number!

Similarly, to clear the entire register you can AND it with 00H.

The OR Instruction

The OR instruction is used for supporting logical expressions by performing a bitwise OR operation. The bitwise OR operator returns 1 if the matching bits from either or both operands are one. It returns 0 if both the bits are zero.

For example,

                 Operand1:    0101
                 Operand2:    0011
    ------------------------------
    After OR ->  Operand1:    0111

The OR operation can be used for setting one or more bits. For example, let us assume the BL register contains 0011 1010 and you need to set the four low-order bits. You can OR it with the value 0000 1111, i.e., 0FH.

    OR BL, 0FH   ; This sets BL to 0011 1111

Example

The following example demonstrates the OR instruction.
Let us store the values 5 and 3 in the AL and the BL registers, respectively. Then the instruction

    OR AL, BL

should store 7 in the AL register:

    section .text
       global _start          ;must be declared for using gcc

    _start:                   ;tell linker entry point
       mov al, 5              ;getting 5 in the al
       mov bl, 3              ;getting 3 in the bl
       or  al, bl             ;or al and bl registers, result should be 7
       add al, byte '0'       ;converting decimal to ascii

       mov [result], al
       mov eax, 4
       mov ebx, 1
       mov ecx, result
       mov edx, 1
       int 0x80

    outprog:
       mov eax, 1             ;system call number (sys_exit)
       int 0x80               ;call kernel

    section .bss
    result resb 1

When the above code is compiled and executed, it produces the following result:

    7

The XOR Instruction

The XOR instruction implements the bitwise XOR operation. The XOR operation sets the resultant bit to 1 if and only if the bits from the operands are different. If the bits from the operands are the same (both 0 or both 1), the resultant bit is cleared to 0.

For example,

                 Operand1:    0101
                 Operand2:    0011
    ------------------------------
    After XOR -> Operand1:    0110

XORing an operand with itself changes the operand to 0. This is used to clear a register.

    XOR EAX, EAX

The TEST Instruction

The TEST instruction works the same as the AND operation, but unlike the AND instruction, it does not change the first operand. So, if we need to check whether a number in a register is even or odd, we can also do this using the TEST instruction without changing the original number.

    TEST AL, 01H
    JZ   EVEN_NUMBER

The NOT Instruction

The NOT instruction implements the bitwise NOT operation. The NOT operation reverses the bits in an operand. The operand could be either in a register or in memory.

For example,

                 Operand1:    0101 0011
    After NOT -> Operand1:    1010 1100
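The XOR, TEST, and NOT descriptions above can be tried out with a small program. This sketch reuses the operands 5 and 3 from the OR example; the labels show and result, and the final masking step with AND, are choices made for illustration.

```nasm
; Sketch only: XOR, TEST, NOT, and AND applied to one value.
section .text
    global _start

_start:
    mov al, 5               ; al = 0000 0101
    xor al, 3               ; 0000 0011 -> al = 0000 0110 (6)
    test al, 1              ; checks bit 0 without changing al
    jnz show                ; taken only for an odd value
    not al                  ; al = 1111 1001
    and al, 0FH             ; keep the low four bits: 0000 1001 (9)

show:
    add al, '0'             ; convert the single digit to ASCII
    mov [result], al

    mov eax, 4              ; system call number (sys_write)
    mov ebx, 1              ; file descriptor (stdout)
    mov ecx, result
    mov edx, 1
    int 0x80

    mov eax, 1              ; system call number (sys_exit)
    mov ebx, 0
    int 0x80

section .bss
result resb 1
```

Since 5 XOR 3 is 6, an even value, the jump is not taken and the program prints 9. Replacing 3 with 2 gives the odd value 7, so the jump skips the NOT and AND steps and the program prints 7.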