Delve into the implementation principles of the Linux shell

Updated: 10:17:40 Feb 10, 2024 by Chun Ren.
This article explains how a Linux shell is implemented by building a small shell in C step by step, with code examples for each stage. It should be a useful reference for anyone who wants to explore how a shell works.

1. Print the command prompt

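#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Prompt decorations. The original definitions are not shown; these values are assumed.
#define LEFT "["
#define RIGHT "]"
#define LABEL "#"

const char* getusername() // Get the user name
{
    return getenv("USER");
}

// Note: when compiled as C, this name conflicts with gethostname() declared in <unistd.h>;
// rename it if that header is included in the same file.
const char* gethostname() // Get the host name
{
    return getenv("HOSTNAME");
}

const char* getpwd() // Get the last component of the current directory
{
    const char* pwd = getenv("PWD");
    if (pwd == NULL)
        return "?"; // PWD is not set (assumed placeholder)
    const char* pos = strrchr(pwd, '/'); // Find the last '/'
    if (pos == NULL)
        return pwd;
    if (*(pos + 1) != '\0')
        return pos + 1; // Not the root directory: return the last folder name
    return pos;         // Root directory: return "/"
}

void tooltip() // Print the command prompt
{
    printf(LEFT "%s@%s %s" RIGHT LABEL " ", getusername(), gethostname(), getpwd());
}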

Code analysis: The basic information is obtained by calling getenv to read the corresponding environment variables (getenv returns NULL when a variable is not set, so robust code should check for that). strrchr finds the last path separator '/' in the current path. That '/' is either an ordinary separator followed by a directory name, or the root directory itself with nothing after it, so the two cases are handled separately.
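The two cases can be checked in isolation. The following is a minimal sketch rather than the article's code: last_component is a hypothetical helper that takes the path as a parameter instead of reading PWD, and the "?" placeholder for a NULL path is an assumption.

```c
#include <string.h>

/* Hypothetical helper: return the last component of an absolute path,
 * or "/" itself when the path is the root directory. */
const char *last_component(const char *path)
{
    if (path == NULL)
        return "?";                       /* assumed placeholder for an unset PWD */
    const char *pos = strrchr(path, '/'); /* last '/' in the path */
    if (pos == NULL)
        return path;                      /* no separator at all: return the path unchanged */
    if (*(pos + 1) != '\0')
        return pos + 1;                   /* "/home/user" -> "user" */
    return pos;                           /* "/" -> "/" */
}
```

Calling last_component(getenv("PWD")) would give the directory name that the prompt prints.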

2. Read the command entered from the keyboard

#include <assert.h>
#include <stdio.h>
#include <string.h>

char command[1024]; // Stores the command typed at the keyboard

int getcommand(char* command, int size) // Read one command line
{
    memset(command, '\0', size);
    char* ret = fgets(command, size, stdin);
    // ret is not expected to be NULL: the user types at least a newline, and fgets reads it
    assert(ret != NULL);
    (void)ret; // "Use" ret so that compilers do not warn when assert is compiled out
    // The buffer now holds, for example, "aaabc\n\0"
    command[strlen(command) - 1] = '\0'; // Remove the trailing '\n'
    return 1;
}

void interact(char* command, int size) // Prompt until a non-empty command is entered
{
    tooltip();
    while (getcommand(command, size) && (strlen(command) == 0))
    {
        tooltip();
    }
}

int main()
{
    interact(command, sizeof(command)); // Interact with the user
    printf("echo: %s\n", command);
    return 0;
}

Code analysis: A command typed at the keyboard is essentially a string. scanf with %s cannot be used to read it, because %s stops at the first whitespace character (a space or a newline). Commands usually carry options, and the command and its options are separated by spaces, so scanf would read only part of the command.

fgets is used instead. Its first argument is the address of the buffer that will hold the command, the second is the size of that buffer, and the third is the stream to read from. A C/C++ program has three streams open by default: stdin, stdout, and stderr. Here the read is from stdin, that is, standard input. fgets reads at most size-1 characters and always appends '\0', so the caller passes the full buffer size and does not need to add the terminator by hand.

fgets returns the address of the buffer on success and NULL on a read error or when end-of-file is reached before any character is read. In the interactive case the user always presses Enter, even for an empty command, so at least a '\n' is read and ret is normally not NULL (pressing Ctrl-D on an empty line is the exception, because it signals end-of-file). Finally, the trailing '\n' is removed, because only the command itself is wanted.

3. Split the command

#include <stdio.h>
#include <string.h>

#define SEPARATOR " "     // Command separator
#define COMMAND_LONG 1024 // Assumed value (matches the size of command[]); not shown in the original
#define ARGC_LONG 64      // Assumed value; not shown in the original

char* argv[ARGC_LONG] = {NULL};

void commandcut(char* command, char** argv, int argvsize) // Split the command
{
    memset(argv, 0, argvsize); // Clear the old arguments
    // Work on a copy so that the command string itself is not changed.
    // static: argv keeps pointers into this buffer after the function returns.
    static char cop_command[COMMAND_LONG];
    memset(cop_command, '\0', sizeof(cop_command));
    for (int i = 0; command[i] != '\0' && i < COMMAND_LONG - 1; i++)
    {
        cop_command[i] = command[i];
    }
    // Start cutting substrings
    int maxargs = argvsize / (int)sizeof(argv[0]) - 1; // Leave room for the final NULL
    char* ret = strtok(cop_command, SEPARATOR);
    int i = 0;
    while (ret != NULL && i < maxargs)
    {
        argv[i++] = ret;
        ret = strtok(NULL, SEPARATOR);
    }
}

int main()
{
    while (1)
    {
        // 1. Interact with the user and obtain the command
        interact(command, sizeof(command));
        // The command has been obtained; now split it
        // 2. Split the command
        commandcut(command, argv, sizeof(argv));
        for (int i = 0; argv[i]; i++)
        {
            printf("[%d]: %s\n", i, argv[i]);
        }
        printf("echo: %s\n", command);
    }
    return 0;
}

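Code analysis: This step uses strtok to cut the command into substrings and stores the starting address of each substring in argv. Note that strtok modifies the buffer it scans (it overwrites each separator with '\0'), so the splitting is done on a copy, cop_command, rather than on command itself. Because argv holds pointers into that copy, the copy must remain valid after commandcut returns, for example by making it static or global.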
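The in-place behavior of strtok is easy to observe directly. The sketch below is illustrative only: split_args is a hypothetical helper that tokenizes the caller's buffer without copying it and returns the number of tokens, which the article's commandcut does not do.

```c
#include <string.h>

/* Hypothetical helper: split buf in place on spaces, store up to max-1
 * token pointers in argv, NULL-terminate argv, and return the token count.
 * Assumes max >= 1. */
int split_args(char *buf, char **argv, int max)
{
    int argc = 0;
    char *tok = strtok(buf, " ");
    while (tok != NULL && argc < max - 1) {
        argv[argc++] = tok;      /* points into buf, no copy is made */
        tok = strtok(NULL, " ");
    }
    argv[argc] = NULL;           /* execvp expects a NULL-terminated array */
    return argc;
}
```

After splitting a buffer that held "ls  -a -l", the buffer itself reads as just "ls", because the first separator has been replaced by '\0'. That is the reason a shell that wants to keep the full command line must tokenize a copy.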

4. Execute ordinary commands

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define EXIT_CODE 127 // Assumed value; the original definition is not shown
int lastcode = 0;     // Exit code of the most recent command (declaration assumed)

void normalcommandexecution(char** _argv, int* _lastcode) // Execute an ordinary command
{
    pid_t id = fork();
    if (id < 0)
    {
        perror("fork");
    }
    else if (id == 0)
    {
        // child
        int ret = execvp(_argv[0], _argv); // Returns only on failure
        if (ret == -1)
        {
            perror("execvp");
            exit(EXIT_CODE);
        }
    }
    else
    {
        // parent
        int status;
        pid_t ret = waitpid(id, &status, 0); // Block until the child exits
        if (ret == id)
        {
            *_lastcode = WEXITSTATUS(status);
        }
    }
}

int main()
{
    while (1)
    {
        // 1. Interact with the user and obtain the command
        interact(command, sizeof(command));
        // The command has been obtained; now split it
        // 2. Split the command
        commandcut(command, argv, sizeof(argv));
        // 3. Execute the command
        normalcommandexecution(argv, &lastcode);
    }
    return 0;
}

Code analysis: An ordinary (non-built-in) command such as ls is executed by first creating a child process with fork and then calling execvp in the child, which replaces the child's program image with the requested command. The parent waits with waitpid and records the child's exit code.
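The fork, execvp, and waitpid sequence can also be written so that the exit code is returned to the caller. This is a sketch under assumptions: run_command is a hypothetical name, the 127 used when execvp fails follows the usual shell convention rather than anything stated in the article, and WIFEXITED is checked before WEXITSTATUS so that a child killed by a signal is reported separately.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv in a child process.
 * Returns the child's exit code, or -1 if fork/waitpid fails or the child
 * was terminated by a signal. */
int run_command(char *const argv[])
{
    pid_t id = fork();
    if (id < 0) {
        perror("fork");
        return -1;
    }
    if (id == 0) {
        execvp(argv[0], argv); /* returns only if the program could not be started */
        perror("execvp");
        _exit(127);            /* assumed code, following the usual shell convention */
    }
    int status;
    if (waitpid(id, &status, 0) != id)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing the array {"sh", "-c", "exit 3", NULL} returns 3, and a command name that does not exist on PATH returns 127 after execvp fails in the child.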

5. Execution of built-in instructions

5.1 cd Command

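#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

bool isnormalcommand(char** _argv) // Decide whether the command is an ordinary one
{
    if (strcmp(_argv[0], "cd") == 0)
        return false;
    return true;
}

void changpwd(char** _argv) // Change the current working directory
{
    char pwd[1024]; // Buffer for the new path (declaration assumed; not shown in the original)
    if (_argv[1] == NULL)
        return; // "cd" without an argument is not handled here
    if (chdir(_argv[1]) != 0) // Change the current working directory
    {
        perror("cd");
        return;
    }
    // Update PWD so that the prompt shows the new directory.
    // The original wrote into the string returned by getenv("PWD") with sprintf,
    // which overflows when the new path is longer than the old one; setenv avoids that.
    if (getcwd(pwd, sizeof(pwd)) != NULL)
        setenv("PWD", pwd, 1);
}

void builtincommand(char** _argv) // Execute a built-in command
{
    if (strcmp(_argv[0], "cd") == 0)
    {
        changpwd(_argv);
    }
}

int main()
{
    while (1)
    {
        // 1. Interact with the user and obtain the command
        interact(command, sizeof(command));
        // The command has been obtained; now split it
        // 2. Split the command
        commandcut(command, argv, sizeof(argv));
        // 3. Classify the command and execute it
        if (isnormalcommand(argv)) // Ordinary command
            normalcommandexecution(argv, &lastcode);
        else // Built-in command
            builtincommand(argv);
    }
    return 0;
}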

Code analysis: To support built-in commands, the command must be classified after it has been split. A built-in command is not executed by a child process; it is executed directly by the shell process itself. Take cd as an example: after it runs, the shell itself should be in the new working directory. If the shell forked a child to run cd, only the child's working directory would change. In general, any command whose effect must persist in the shell process has to be a built-in.

What exactly does cd change? myshell is an executable program, and its source code and compiled executable always sit in the /home/wcy-linux-s /2023-10-28a/myshell directory, so how can cd change its working directory? The answer is that cd does not move the program. When a program runs, the kernel creates a PCB (process control block) for the process, and one attribute kept in the PCB is the process's current working directory. cd changes that attribute, and the chdir system call is how it is modified.

Finally, the prompt reads the current directory from the PWD environment variable. myshell inherits that variable from its parent process, and it is not updated automatically when the working directory changes, so PWD must be updated after cd runs. An environment variable is essentially a string held in memory, so it can be overwritten in place (for example with sprintf), but that overflows if the new path is longer than the old one; calling setenv("PWD", path, 1) is the safer way to update it.
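The claim that a child's chdir does not affect the shell can be verified with a few lines. The sketch below is an illustration, not part of myshell: parent_cwd_changed_by_child is a hypothetical helper, and the 4096-byte path buffers are an assumed size.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, let the child chdir(dir), then check
 * whether the PARENT's working directory changed.
 * Returns 1 if it changed, 0 if it did not, -1 on error. */
int parent_cwd_changed_by_child(const char *dir)
{
    char before[4096], after[4096];          /* assumed buffer size */
    if (getcwd(before, sizeof(before)) == NULL)
        return -1;
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0)
        _exit(chdir(dir) == 0 ? 0 : 1);      /* child: change directory, then exit */
    if (waitpid(id, NULL, 0) != id)
        return -1;
    if (getcwd(after, sizeof(after)) == NULL)
        return -1;
    return strcmp(before, after) != 0;       /* compare the parent's cwd before and after */
}
```

The function returns 0 for any directory, because each process has its own working directory attribute. This is why cd has to call chdir inside the shell process rather than in a forked child.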

5.2 export Command

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define USER_ENV_SIZE 100  // Number of environment variables that a user can add
#define USER_ENV_LONG 1024 // Maximum length of one user environment variable

char userenv[USER_ENV_SIZE][USER_ENV_LONG]; // Stores the environment variables added by the user
int userenvnum = 0;                         // Number of environment variables added so far

void exportcommand(char** _argv, char (*_userenv)[USER_ENV_LONG], int* _userenvnum)
{
    if (_argv[1] == NULL || *_userenvnum >= USER_ENV_SIZE || strlen(_argv[1]) >= USER_ENV_LONG)
        return; // Nothing to export, or no room left
    // Store the environment variable entered by the user
    strcpy(_userenv[*_userenvnum], _argv[1]);
    int ret = putenv(_userenv[(*_userenvnum)++]);
    if (ret != 0) // putenv returns 0 on success
        perror("putenv");
}

Code analysis: Every environment variable added by the user must stay valid until the shell exits. The text the user types, including the NAME=value string, lives in the command buffer, and that buffer is cleared the next time a command is read. putenv does not copy the string into the environment table; it stores the address of the string there. The string passed to putenv must therefore stay unmodified for as long as the variable is in use, which means each user-supplied string has to be kept in separate storage that later commands do not overwrite.

Because an environment variable is essentially a string, a two-dimensional character array is defined to hold the user's variables. Each new variable is first copied into a row of that array, and putenv is then called on that row. This keeps every variable the user has added intact for as long as the shell is running.

This also involves passing a two-dimensional array to a function. As a reminder, an array name decays to the address of its first element, and the first element of a two-dimensional array is a one-dimensional array, so the function parameter is a pointer to a one-dimensional character array, that is, char(*)[USER_ENV_LONG].
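That putenv stores the address rather than a copy can be demonstrated by changing the buffer after the call. The following is a small sketch: putenv_shares_buffer and the variable name MYSHELL_DEMO are made up for the illustration.

```c
#include <stdlib.h>
#include <string.h>

static char demo_buf[64]; /* plays the role of a reused input buffer */

/* Hypothetical demo: returns 1 if overwriting the buffer handed to putenv
 * changes what getenv reports, 0 if it does not, -1 if putenv fails. */
int putenv_shares_buffer(void)
{
    strcpy(demo_buf, "MYSHELL_DEMO=first");
    if (putenv(demo_buf) != 0)
        return -1;
    /* Overwrite the same buffer, as reading the next command would do. */
    strcpy(demo_buf, "MYSHELL_DEMO=second");
    const char *value = getenv("MYSHELL_DEMO");
    return value != NULL && strcmp(value, "second") == 0;
}
```

On a POSIX system the function returns 1: the environment entry is the buffer itself, so the later write is visible through getenv. Had the buffer been overwritten with an unrelated command line, the variable would have been corrupted or lost, which is the reason for keeping one dedicated row per exported variable.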

5.3 echo Command

void echocommand(char **_argv, int _argc)
{
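    if (_argc < 2 || _argv[1] == NULL) // Plain "echo": print an empty line
    {
        printf("\n");
    }
    else if (_argv[1][0] == '$')
    {
        char *ptr = _argv[1] + 1;           // Skip the '$'
        const char *value = getenv(ptr);
        printf("%s\n", value ? value : ""); // An unset variable prints an empty line
    }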
    else
    {
        int i = 1;
        while (i < _argc)
        {
            char *ret = strtok(_argv[i], "\"");
            while (ret != NULL)
            {
                printf("%s", ret);
                ret = strtok(NULL, "\"");
            }
            printf("%c", ' ');
            i++;
        }
        printf("\n");
    }
}

Code analysis: The echo command has to strip the double quotes from its input, handle several strings entered in a row, and, when its argument starts with $, print the value of the named environment variable.
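Quote removal can also be done without strtok, which avoids modifying the argument strings. The sketch below uses a different technique from the article's code: strip_quotes is a hypothetical helper that copies the text into a separate output buffer and skips every double-quote character.

```c
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out` (capacity outsize) without the
 * double-quote characters. The output is always NUL-terminated and is
 * truncated if it does not fit. */
void strip_quotes(const char *in, char *out, size_t outsize)
{
    size_t n = 0;
    if (outsize == 0)
        return;
    for (; *in != '\0' && n + 1 < outsize; in++) {
        if (*in != '"')
            out[n++] = *in; /* keep everything except '"' */
    }
    out[n] = '\0';
}
```

An echo implementation could call this once per argument and print the results separated by single spaces.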

Summary: When a user logs in, the system starts a shell process for that user. The shell obtains its own environment variables at login by reading the .bash_profile file in the user's home directory, which contains the commands that set up the environment.

6. Conclusion

This concludes the walkthrough of how a Linux shell is implemented.
