Delve into the implementation principles of the Linux shell

Updated: 10:17:40 Feb 10, 2024 by Chun Ren.

This article explains how a Linux shell works by building a small one in C, step by step, with a code example for each stage: printing the prompt, reading the command, splitting it into arguments, running ordinary programs, and handling built-in commands.

1. Print the command prompt

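#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// LEFT, RIGHT and PROMPT are string macros defined elsewhere in the program
// (the text printed before the prompt, after it, and the prompt character).

const char* getusername() // Get the user name
{
    return getenv("USER");
}

// Get the host name.
// Note: this name clashes with gethostname() declared in <unistd.h>;
// rename it if that header is included in the same file.
const char* gethostname()
{
    return getenv("HOSTNAME");
}

const char* getpwd() // Get the last component of the current directory
{
    char* pos = strrchr(getenv("PWD"), '/'); // Find the last '/'
    if (*(pos + 1) != '\0')
        return pos + 1; // Not the root directory: return the last folder name
    return pos;         // Root directory: return "/"
}

void tooltip() // Print the command-line prompt
{
    printf(LEFT "%s@%s %s" RIGHT PROMPT " ", getusername(), gethostname(), getpwd());
    fflush(stdout); // The prompt has no '\n', so flush it explicitly
}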

Code analysis: The user name, host name and current directory are all obtained by calling getenv on the corresponding environment variable. strrchr finds the last path separator '/' in the current path. If a character follows that '/', the path is not the root directory and the text after the '/' is the name of the current folder. If nothing follows it, the path is "/" itself, so the '/' is returned.
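The code above assumes that USER, HOSTNAME and PWD are all set. The following is a minimal sketch of a more defensive variant, not part of the original program: last_component and env_or are hypothetical helper names, and the "?" placeholder is an arbitrary choice.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the last component of a path, or "/" when nothing follows the
   last '/'. A NULL path yields "?" so the caller never dereferences NULL
   (the placeholder is an assumption made for this sketch). */
const char *last_component(const char *path)
{
    if (path == NULL)
        return "?";
    const char *slash = strrchr(path, '/');
    if (slash == NULL)
        return path;        /* no separator at all: return the path itself */
    if (slash[1] != '\0')
        return slash + 1;   /* text after the last '/' */
    return slash;           /* nothing after the last '/': for PWD this is the root */
}

/* Read an environment variable, substituting a fallback when it is unset. */
const char *env_or(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}
```

A prompt function could then call, for example, last_component(getenv("PWD")) and env_or("HOSTNAME", "localhost") without risking a NULL being passed to printf.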

2. Read the command typed at the keyboard

#include <assert.h>
#include <stdio.h>
#include <string.h>

char command[1024]; // Stores the command typed at the keyboard

int getcommand(char* command, int size) // Read one command line
{
    memset(command, '\0', size);
    char* ret = fgets(command, size, stdin);
    // ret is not NULL in normal use: the user types at least Enter, and fgets reads the '\n'
    assert(ret != NULL);
    (void)ret; // Pretend to use ret so that compilers do not warn when assert is compiled out
    // "aaabc\n\0" -> "aaabc\0"
    command[strcspn(command, "\n")] = '\0'; // Remove the '\n' if there is one
    return 1;
}

void interact(char* command, int size) // Prompt until a non-empty command is entered
{
    tooltip();
    while (getcommand(command, size) && (strlen(command) == 0))
    {
        tooltip();
    }
}

int main()
{
    interact(command, sizeof(command)); // Interaction
    printf("echo: %s\n", command);
    return 0;
}

Code analysis: A command typed at the keyboard is just a string. scanf with %s cannot be used to read it, because it stops at the first space or newline, while commands usually carry options that are separated from the command name by spaces. Using scanf would therefore give us an incomplete command.

fgets is used instead. Its first argument is the address of the buffer that receives the command, the second is the size of that buffer, and the third is the stream to read from. A C/C++ program has three streams open by default (stdin, stdout and stderr); here we read from stdin, the standard input. fgets reads at most size - 1 characters and always appends '\0', so we do not have to reserve space for the terminator or add it by hand.

On success fgets returns the address of the buffer; on a read error or at end of input it returns NULL. In the current scenario the user ends every line with Enter, even an empty one, so at least a '\n' is read and ret is not NULL in normal use. Finally, the '\n' that fgets stores has to be removed, because we only want the command itself.
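The assert above stops the program if fgets ever returns NULL. As a hedged alternative, the sketch below reports end of input to the caller instead. read_line is a hypothetical name; it takes the stream as a parameter so that it can be tried on any FILE*, not only stdin.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from `in` into buf (at most size-1 characters).
   Returns 1 on success and 0 on end-of-file or read error.
   The trailing newline, if present, is removed. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0'; /* strcspn finds the first '\n', or the end of the string */
    return 1;
}
```

A shell loop could call read_line(stdin, command, sizeof(command)) and leave the loop when it returns 0, which is one way to make end of input terminate the shell cleanly.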

3. Split the command

#define SEPARATOR " " // Command separator
// ARGC_LONG and COMMAND_LONG are size macros defined elsewhere in the program.

char* argv[ARGC_LONG] = {NULL};

void commandcut(char* command, char** argv, int argvsize) // Split the command
{
    memset(argv, 0, argvsize); // Clear the old entries
    // Work on a copy so that the command string itself is not changed.
    // The copy is static because argv keeps pointing into it after this function returns.
    static char cop_command[COMMAND_LONG];
    snprintf(cop_command, sizeof(cop_command), "%s", command);

    // Start cutting substrings
    int max = argvsize / (int)sizeof(char*) - 1; // Leave room for the terminating NULL
    char* ret = strtok(cop_command, SEPARATOR);
    int i = 0;
    while (ret != NULL && i < max)
    {
        argv[i++] = ret;
        ret = strtok(NULL, SEPARATOR);
    }
}

int main()
{
    while (1)
    {
        // 1. Interact and read the command
        interact(command, sizeof(command));
        // The command has been read; now split it
        // 2. Split the command into substrings
        commandcut(command, argv, sizeof(argv));
        for (int i = 0; argv[i]; i++)
        {
            printf("[%d]: %s\n", i, argv[i]);
        }
        printf("echo: %s\n", command);
    }
    return 0;
}

Code analysis: This step uses strtok to cut the command into substrings and stores the start address of each substring in argv. strtok overwrites the separators in the string it scans, so the command is first copied into a temporary buffer, cop_command, and the copy is cut instead. Because argv points into that copy, the copy must stay valid for as long as argv is used (for example by giving it static storage).
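strtok keeps its position in hidden global state. A common alternative is the POSIX function strtok_r, which keeps that state in a caller-supplied pointer. The sketch below is an illustration under that assumption, not the article's code; split_args is a hypothetical name, and it cuts the line in place, so the caller decides whether to pass the original buffer or a copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Split `line` in place on spaces and tabs. Stores up to max-1 token
   pointers in argv, followed by a NULL terminator, and returns the
   number of tokens stored. */
int split_args(char *line, char **argv, int max)
{
    if (max < 1)
        return 0;               /* no room even for the NULL terminator */
    int argc = 0;
    char *save = NULL;          /* strtok_r's position between calls */
    for (char *tok = strtok_r(line, " \t", &save);
         tok != NULL && argc < max - 1;
         tok = strtok_r(NULL, " \t", &save))
    {
        argv[argc++] = tok;
    }
    argv[argc] = NULL;          /* execvp expects a NULL-terminated array */
    return argc;
}
```

Runs of several separators produce no empty tokens, so "ls   -l" yields the same two arguments as "ls -l".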

4. Run ordinary commands

// Needs <unistd.h>, <sys/types.h> and <sys/wait.h>.
// EXIT_CODE is a macro and lastcode an int variable, both defined elsewhere in the program.

void normalcommandexecution(char** _argv, int* _lastcode) // Run an ordinary command
{
    pid_t id = fork();
    if (id < 0)
    {
        perror("fork");
    }
    else if (id == 0)
    {
        // Child: replace this process image with the requested program
        execvp(_argv[0], _argv);
        // execvp returns only if it failed
        perror("execvp");
        exit(EXIT_CODE);
    }
    else
    {
        // Parent
        int status;
        pid_t ret = waitpid(id, &status, 0); // Blocking wait
        if (ret == id && WIFEXITED(status))
        {
            *_lastcode = WEXITSTATUS(status);
        }
    }
}

int main()
{
    while (1)
    {
        // 1. Interact and read the command
        interact(command, sizeof(command));
        // The command has been read; now split it
        // 2. Split the command into substrings
        commandcut(command, argv, sizeof(argv));
        // 3. Run the command
        normalcommandexecution(argv, &lastcode);
    }
    return 0;
}

Code analysis: For an ordinary (non-built-in) command such as ls, the shell first creates a child process with fork, and the child then calls execvp to replace its program image with the requested command. The parent blocks in waitpid until the child finishes and saves the child's exit code.
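The same fork, execvp and waitpid sequence can be packaged as a function that returns the exit code, which makes it easy to try on its own. This is a sketch with assumptions: run_command is a hypothetical name, and the values 127 (program could not be executed) and 128 + signal number follow the convention used by common shells rather than anything stated in the article.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments and wait for it to finish.
   Returns the child's exit code, 128 + signal number if it was killed
   by a signal, or -1 if fork or waitpid failed. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
    {
        perror("fork");
        return -1;
    }
    if (pid == 0)
    {
        execvp(argv[0], argv);  /* returns only on failure */
        perror(argv[0]);
        _exit(127);             /* _exit: do not flush the parent's stdio buffers twice */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
    {
        perror("waitpid");
        return -1;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

For example, passing the array {"sh", "-c", "exit 3", NULL} should make run_command return 3, assuming sh is available on PATH.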

5. Run built-in commands

5.1 cd Command

// Needs <stdbool.h>, <stdlib.h>, <string.h> and <unistd.h>.
// pwd is a char buffer defined elsewhere in the program.

bool isnormalcommand(char** _argv) // Decide whether the command is an ordinary one
{
    if (strcmp(_argv[0], "cd") == 0)
        return false;
    return true;
}

void changpwd(char** _argv) // Change the current working directory
{
    if (_argv[1] == NULL)
        return; // No target directory given
    if (chdir(_argv[1]) != 0) // chdir changes this process's working directory
    {
        perror("cd");
        return;
    }
    // Update PWD to match. setenv is used rather than sprintf into getenv("PWD"),
    // because writing in place overflows when the new path is longer than the old one.
    if (getcwd(pwd, sizeof(pwd)) != NULL)
        setenv("PWD", pwd, 1);
}

void builtincommand(char** _argv) // Run a built-in command
{
    if (strcmp(_argv[0], "cd") == 0)
    {
        changpwd(_argv);
    }
}

int main()
{
    while (1)
    {
        // 1. Interact and read the command
        interact(command, sizeof(command));
        // The command has been read; now split it
        // 2. Split the command into substrings
        commandcut(command, argv, sizeof(argv));
        // 3. Classify the command and run it
        if (isnormalcommand(argv)) // Ordinary command
            normalcommandexecution(argv, &lastcode);
        else // Built-in command
            builtincommand(argv);
    }
    return 0;
}

Code analysis: To support built-in commands, the command has to be classified after it has been split. A built-in command is not run in a child process; the shell process runs it itself. Take cd: after it runs, we want the shell's own working directory to change. If the shell created a child process to run cd, only the child's working directory would change. In general, any command whose effect must persist in the shell itself has to be a built-in.

What exactly does cd change? myshell is an executable program, and its source code and compiled binary always sit in the /home/wcy-linux-s /2023-10-28a/myshell directory; cd does not move them. When a program runs, the kernel creates a PCB (process control block) for the process, and one of the attributes kept there is the process's current working directory. cd changes that attribute, not the location where myshell is stored, and the chdir system call is how we modify it.

Finally, the prompt gets the current directory from the PWD environment variable, and the environment that myshell inherited from its parent process does not update itself, so after running cd we also have to update PWD. An environment variable is essentially a string stored in memory, so it can be overwritten in place with sprintf, but that overflows the old string if the new path is longer; calling setenv("PWD", path, 1) is the safer way to do it.
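The two steps described above (change the process's working directory, then bring PWD back in sync) can be sketched as one small function. change_dir is a hypothetical name, and the 4096-byte buffer is an assumed upper bound on path length for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/* Change this process's working directory and update PWD to match.
   Returns 0 on success and -1 on failure; on failure neither the
   working directory nor PWD is changed by chdir. */
int change_dir(const char *path)
{
    char cwd[4096];                     /* assumed large enough for the new path */
    if (path == NULL || chdir(path) != 0)
        return -1;
    if (getcwd(cwd, sizeof cwd) == NULL)
        return -1;
    return setenv("PWD", cwd, 1);       /* setenv copies the string; 1 = overwrite */
}
```

Because getcwd is asked for the result, PWD holds the resolved absolute path even when the argument was relative, such as "..".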

5.2 export Command

#define USER_ENV_SIZE 100  // Number of environment variables a user can add
#define USER_ENV_LONG 1024 // Maximum length of one user environment variable

char userenv[USER_ENV_SIZE][USER_ENV_LONG]; // Stores the environment variables added by the user
int userenvnum = 0;                         // Number of environment variables added so far

void exportcommand(char** _argv, char (*_userenv)[USER_ENV_LONG], int* _userenvnum)
{
    if (_argv[1] == NULL || *_userenvnum >= USER_ENV_SIZE || strlen(_argv[1]) >= USER_ENV_LONG)
        return; // Nothing to export, table full, or string too long
    // Copy the user's string into storage that outlives the command buffer
    strcpy(_userenv[*_userenvnum], _argv[1]);
    int ret = putenv(_userenv[(*_userenvnum)++]);
    if (ret != 0) // putenv returns 0 on success and nonzero on failure
        perror("putenv");
}

Code analysis: Environment variables added by the user should persist for as long as the shell does not exit. The text the user types, including the NAME=value string, is stored in the command buffer, and that buffer is cleared when the next command is read. putenv does not copy the string into the environment table; it stores only the address of the string. The string at that address therefore must not be modified afterwards, which means each exported string needs its own storage, separate from the command buffer, so that reading the next command does not disturb the variables added earlier.

Since an environment variable is essentially a string, we define a two-dimensional char array for the user's variables: the string is first copied into a free row of that array, and putenv is then called with the address of that row. The variables the user has added so far then remain valid until the shell exits.

This also involves passing a two-dimensional array to a function. As a reminder, an array name decays to the address of its first element, and the first element of a two-dimensional array is a one-dimensional array, so the type of the function parameter is a pointer to a one-dimensional char array, that is, char(*)[USER_ENV_LONG].
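A different technique from the one the article uses is setenv, which copies the name and the value, so the caller does not have to keep the string alive. The sketch below shows that alternative under stated assumptions: export_assignment is a hypothetical name, and the 256-byte limit on the variable name is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Export a "NAME=value" string using setenv, which stores its own copy.
   Returns 0 on success, -1 if the argument is malformed or setenv fails. */
int export_assignment(const char *assignment)
{
    if (assignment == NULL)
        return -1;
    const char *eq = strchr(assignment, '=');
    if (eq == NULL || eq == assignment)
        return -1;                      /* no '=' or empty name */
    size_t name_len = (size_t)(eq - assignment);
    char name[256];                     /* assumed upper bound on the name length */
    if (name_len >= sizeof name)
        return -1;
    memcpy(name, assignment, name_len);
    name[name_len] = '\0';
    return setenv(name, eq + 1, 1);     /* 1 = overwrite an existing value */
}
```

With this approach the command buffer can be reused immediately after the call, because the environment no longer points into it.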

5.3 echo Command

void echocommand(char **_argv, int _argc)
{
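    if (_argv[1] == NULL) // Plain "echo" with no argument: print an empty line
    {
        printf("\n");
    }
    else if (_argv[1][0] == '$')
    {
        char *ptr = _argv[1] + 1;           // Skip the '$' to get the variable name
        char *value = getenv(ptr);
        printf("%s\n", value ? value : ""); // An unset variable prints as an empty string
    }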
    else
    {
        int i = 1;
        while (i < _argc)
        {
            char *ret = strtok(_argv[i], "\"");
            while (ret != NULL)
            {
                printf("%s", ret);
                ret = strtok(NULL, "\"");
            }
            printf("%c", ' ');
            i++;
        }
        printf("\n");
    }
}

Code analysis: echo has to strip the double quotes from its arguments, handle several arguments in a row, and, when its argument starts with $, print the value of the named environment variable.
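The per-argument logic can also be written so that it produces a string instead of printing, which makes it easier to check. This is an illustrative sketch, not the article's implementation: expand_arg is a hypothetical name, and it removes quotes by copying character by character rather than with strtok, so the argument itself is left unmodified.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write the text echo would print for one argument into out.
   "$NAME" becomes the value of NAME (empty if unset); any other
   argument is copied with its double quotes removed.
   The result is truncated to fit in size bytes. */
void expand_arg(const char *arg, char *out, size_t size)
{
    if (size == 0)
        return;
    if (arg[0] == '$')
    {
        const char *value = getenv(arg + 1);
        snprintf(out, size, "%s", value != NULL ? value : "");
        return;
    }
    size_t n = 0;
    for (const char *p = arg; *p != '\0' && n + 1 < size; p++)
    {
        if (*p != '"')
            out[n++] = *p;  /* keep everything except double quotes */
    }
    out[n] = '\0';
}
```

An echo built on this helper would call it once per argument and print the results separated by single spaces.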

Summary: When a user logs in, the system starts a shell process for that user. The shell gets its own environment variables at that point by reading the .bash_profile file in the user's home directory, which contains the commands that set and export them.

6. Conclusion

That concludes this look at how a Linux shell is implemented.
