Commentary on openmosix-git.patch

hunk    class           file patched                            description                                                             noteworthy references
001     i386,config     arch/i386/Kconfig                       this patch adds hpc/Kconfig to the build process
002     i386,remote     arch/i386/kernel/asm-offsets.c          this patch defines offsets to members inside of structures, and defines
        syscall                                                 used by the assembly code in arch/i386/kernel/entry.S. first, we
                                                                generate the offset to the om member of the task structure, from the
                                                                beginning of the structure. we then generate an offset to the dflags
                        DDEPUTY and DREMOTE should have a       member, inside of task.om, which is of type openmosix_task. finally,
                        following _asm, indicating they're      we define DDEPUTY and DREMOTE constants, setting them to the DDEPUTY
                        the versions used by assembly code.     and DREMOTE values defined in hpc/task.h.
003     i386,remote     arch/i386/kernel/entry.S                we modify this file to add two new entry points to the kernel, utilize
        remotefork      how do we make these changes            the syscall mapping table in arch/i386/kernel/omasm.h (in both the
        local           conditional? #ifdefs?                   normal int 80h syscall path, and the sysenter path), make the two
        syscall                                                 syscall exit points store a pointer to the thread_info of this process,
                                                                and insert a call to openmosix_pre_usermode in the userspace return
                                                                path. ret_from_deputy_fork is entered by a process, when its eip is set
                                                                to this function by the code we add in the function copy_thread in
                        if this is identical, why use it?       arch/i386/kernel/process.c. ret_from_deputy_fork is an identical copy
                                                                of ret_from_fork. ret_from_kickstart is called from arch_kickstart in
                        isn't this GET_THREAD_INFO redundant?   hpc/arch-i386.c. it calls GET_THREAD_INFO(%ebp), and jmps to
                                                                syscall_exit, returning to userspace for the 'first' time on a remote
                                                                node. we then modify the resume_userspace entry point, to call
                                                                openmosix_pre_usermode between doing work_pending, and restore_all'ing.
                                                                in the next two hunks, we add code to select which syscall table to use
                                                                based on whether the current task is marked DREMOTE or not. this is
                                                                added once in ENTRY(sysenter_entry), and again in ENTRY(system_call).
                                                                we also modify syscall_exit and sysenter_exit to store the result of
                                                                GET_THREAD_INFO into %ebp, cleaning up after our own code clobbers the
                                                                register.
004     i386,i387       arch/i386/kernel/i387.c                 fxsr support is support for fast saving of the i387's floating
                                                                point/sse/sse2/etc state to a 512 byte block. it's a new feature,
                                                                not present in earlier i387 style floating point processors. this patch
                                                                changes the declarations of the fxsr<->387 conversion functions from
                                                                static to OM_NSTATIC, and adds a function for finding out whether
                                                                support exists during run time.
005     i386,remote     creates arch/i386/kernel/omasm.h        this file contains the syscall table called by processes which are
                        what's the rule for whether to process  DREMOTE. it contains a mapping of whether a syscall is to be passed to
                        locally or back home?                   the home node, or handled locally.
                        #define self out if !CONFIG_OPENMOSIX
006     userthread      arch/i386/kernel/process.c              in this patch, we add an entry for user_thread_helper in the
        remotefork                                              kernel_thread_helper execution path, add a function for creating an
                                                                in-kernel user thread, and re-direct the entry point ret_from_fork to
                                                                ret_from_deputy_fork for processes that are DDEPUTY in copy_thread().
                                                                our user_thread_helper entry point merely subtracts 60h from the stack
                                                                pointer for this task, reserving space for the user registers on the
                                                                stack, allowing execution to continue into the kernel thread helper.
                                                                user_thread is called by openmosix_mig_daemon, in hpc/migrecv.c, to
                                                                start a 'kernel thread', in the user segment, to handle an incoming
                                                                migration request. to do this, we set up the user registers in a
                                                                pt_regs structure, so that we can call do_fork, and have it create the
                                                                thread for us. first we zero the structure. then we assign the
                                                                function pointer to the function we want to start in to ebx, set edx to
                                                                the function's argument, set xds and xes to allow the process access to
                                                                __USER_DS (the usermode dataspace), and set xcs to allow the process
                                                                to execute code in __KERNEL_CS (the kernel's codespace). we set the
                                                                orig_eax 'register' to -1, and set the eip to point to our
                                                                user_thread_helper above, so that execution starts there, and set our
                                                                eflags so that when this process is running, hardware interrupts are    http://x86.org/intel.doc/386manuals.htm
                                                                enabled, and so that the sign and parity bits are turned on. we then
                                                                call do_fork, adding flags indicating that we don't want this process
                                                                to receive SIGCHLD, and that it cannot be ptraced. we return the
                                                                result of the do_fork call. finally, we modify copy_thread so that for
                                                                processes marked DDEPUTY, instead of the parent process returning to
                                                                userspace and immediately entering the kernel at ret_from_fork, we set
                                                                it to enter the kernel at ret_from_deputy_fork.
007     i386,ksocket    arch/i386/kernel/signal.c               this patch changes the do_signal function from static to OM_NSTATIC
008     i386,remote     arch/i386/kernel/sys_i386.c             this patch modifies sys_mmap2 so that processes marked DREMOTE mapping
                                                                memory without MAP_ANONYMOUS get forwarded to remote_do_mmap.
009     i386,local      arch/i386/kernel/vm86.c                 this patch changes the save_v86_state and return_to_32bit functions
        remote                                                  so that they clear the DSTAY_86 flag before they exit. it also changes
                                                                both sys_vm86 and sys_vm86old so that they return a process to its
                                                                home node if it attempts to enter vm86 mode. the code we add to
                                                                save_v86_state and return_to_32bit simply task_lock()s the current
                                                                task, uses task_clear_stay() to clear the DSTAY_86 flag, and
                                                                task_unlock()s the current task. the code we add to sys_vm86 and
                                                                sys_vm86old simply calls task_go_home_for_reason(), specifying
                                                                DSTAY_86. if task_go_home_for_reason returns non-zero, we force the
                        better error message?                   function we're in to return -ENOMEM, as we must be on remote, and
                                                                migration must have failed.
010     i386,remotemem  arch/i386/lib/usercopy.c                this patch changes strlen_user to redirect to deputy_strlen_user if
                                                                openmosix_memory_away().
011     ppc,config      arch/ppc/Kconfig                        this patch adds hpc/Kconfig to the build process
012     ppc,remote      arch/ppc/kernel/asm-offsets.c           this patch defines offsets to members inside of structures, and defines
        syscall                                                 used by the assembly code in arch/ppc/kernel/entry.S. first, we
                                                                generate the offset to the om member of the task structure, from the
                                                                beginning of the structure. we then generate an offset to the dflags
                        DDEPUTY and DREMOTE should have a       member, inside of task.om, which is of type openmosix_task. finally,
                        following _asm, indicating they're      we define DDEPUTY and DREMOTE constants, setting them to the DDEPUTY
                        the versions used by assembly code.     and DREMOTE values defined in hpc/task.h.
013     ppc,remote      arch/ppc/kernel/entry.S                 we modify this file to add a new entry point to the kernel, utilize the
        syscall                                                 syscall mapping table in arch/ppc/kernel/misc.S, and insert a call to
                                                                openmosix_pre_usermode in our userspace return path. first we modify
                                                                syscall_dotrace_cont, selecting which syscall table to use based on
                                                                whether the current task is marked DREMOTE or not. ret_from_kickstart
                                                                is called from arch_kickstart. ret_from_kickstart branches directly to
                                                                ret_from_syscall, returning to userspace for the 'first' time on a
                                                                remote node. our last hunk branches directly to openmosix_pre_usermode
                                                                in the restore_user path.
014     ppc,userthread  arch/ppc/kernel/misc.S                  in this patch, we add an assembly function to create usermode threads,
                                                                and create the remote syscall table. first we define SIGCHLD. our next
                                                                hunk creates a user_thread function similar to the one in
                        rewrite in C, if possible!              arch/i386/kernel/process.c, except hand written in assembly. finally,
                        move syscall table to omasm.h!          we create the syscall table used by processes which are DREMOTE. it
                                                                contains a mapping of whether a syscall is to be passed to the home
                                                                node, or handled locally.
015     x86_64,config   arch/x86_64/Kconfig                     this patch adds hpc/Kconfig to the build process
016     x86_64,remote   arch/x86_64/kernel/asm-offsets.c        this patch defines offsets to members inside of structures, and defines
        syscall         remove ifdef around header. redundant.  used by the assembly code in arch/x86_64/kernel/entry.S. first, we
                        move this define with the others.       generate the offset to the om member of the task structure, from the
                        add a define around task! is task used? beginning of the structure. next we define an entry for task. in our
                                                                last hunk, we generate an offset to the dflags member, inside of
                        DDEPUTY and DREMOTE should have a       task.om. we then define DDEPUTY and DREMOTE, setting them to the
                        following asm!                          DDEPUTY and DREMOTE values defined in hpc/task.h.
017     x86_64,remote   arch/x86_64/kernel/entry.S              we modify this file to add a new entry point for returning from
        syscall         don't define out omasm.h                kickstart, utilize the syscall mapping table in
                                                                arch/x86_64/kernel/omasm.h, modify the PTREGSCALL macro to create om_
                                                                entries for each of the 6 functions that take a PTREGS argument, insert
                                                                an om_ptregscall_common entry point, insert an om_stub_execve entry
                                                                point, insert a call to openmosix_pre_usermode, and insert a
                        re-write user_thread in C!              user_thread function written in assembly. ret_from_kickstart restores
                                                                the state of the registers to how they were before it was called, and
                                                                returns to userspace for the first time, on a remote node. in the
                                                                system_call entry point, we check for DREMOTE in task.om.dflags.
                        why the fake frame? CFI_ADJUST!         if we find it, we jump over a stack frame, call into our
                        UNFAKE_STACK_FRAME?                     remote_sys_call_table, step back under the stack frame, and jmp to
                                                                ret_from_syscall. otherwise, we pass through to the normal syscall
                                                                handler. next, we re-define the PTREGSCALL macro so that when it's used,
                                                                it creates two entries instead of one. one 'normal' entry, and one
                                                                entry prepended with om_, that loads the address of an om_ version of
                                                                the function being declared, and calls our om_ptregscall_common
                                                                entry to dispatch. om_ptregscall_common is our version of
                        non-functional differences?             ptregscall_common. the only functional difference is that ours begins by
                                                                jumping over a stack frame, does the same work as ptregscall_common to
                                                                call the C function pointed to in rax, and peels back to before the
                                                                stack frame we jumped over. entry om_stub_execve is similar to
                                                                stub_execve, only we call remote_do_execve instead of sys_execve, and we
                                                                save the contents of r11 (eflags) in r15 during this call, restoring
                                                                afterwards. our next hunk modifies the common_interrupt entry to call
                                                                our openmosix_pre_usermode function during the return to userspace.
                        move to C?                              finally, we create our user_thread entry, which is responsible for
                                                                creating our 'user thread', similar to the two previous user_thread
                                                                functions.
018     x86_64,remote   creates arch/x86_64/kernel/omasm.h      this is a table containing mappings that are used to dispatch syscall
                        #define self out if !CONFIG_OPENMOSIX   requests made by processes that are guests. it contains mappings of
                                                                whether a syscall is to be passed to the home node, or processed
                        what about sys_ni_syscall?              locally. the entries are referenced by code generated by the
                                                                PTREGSCALL macro in arch/x86_64/kernel/entry.S. each mapping stores the
                                                                address of one of om_sys_local, om_sys_remote, or sys_ni_syscall.
                                                                this address is loaded into %rax by PTREGSCALL, and PTREGSCALL calls
                                                                om_ptregscall_common to dispatch to the function retrieved from this
                                                                table.
019     x86_64,remote   arch/x86_64/kernel/sys_x86_64.c         this patch redirects sys_mmap2 so that remote processes mapping memory
                                                                without MAP_ANONYMOUS get forwarded to remote_do_mmap.
020     x86_64,local    arch/x86_64/lib/copy_user.S             this patch redirects copy_to_user and copy_from_user so when the kernel
        rmem                                                    on the home node is accessing memory in the process's userspace, it
                                                                gets redirected to functions accessing memory on the remote node. our
                                                                first hunk modifies copy_to_user, checking to see if the process is
                                                                marked DDEPUTY, and if so re-directing to deputy_copy_to_user. the
                                                                second hunk accomplishes the same task, re-directing copy_from_user to
                        better label than 2901!                 deputy_copy_from_user if the task is marked DDEPUTY.
021     x86_64,local    arch/x86_64/lib/usercopy.c              this patch forwards __strncpy_from_user and strncpy_from_user so when
        rmem            missing comments on #endifs             the kernel on the home node is accessing memory in a deputy process's
                                                                userspace, we use deputy_strncpy_from_user. we also forward
                                                                __strlen_user and strlen_user to deputy_strlen_user for the same
                                                                reason.
022     local,rmem      fs/namei.c                              modify getname to use deputy_strncpy_from_user to get the filename
                        BUG! this function is supposed to       requested from userspace from the remote node when
                        canonicalize the filename passed in,    openmosix_memory_away().
                        not just return it!
                        missing comment on #endif
023     local,procfs    fs/proc/base.c                          this patch adds "files" named where, stay, and debug in a directory
                                                                named "om" in the /proc/$PID/ directory of each process on the
                        take out #ifdef around header include.  local node. first, we include the hpc/hpc.h header. in the next two
                                                                hunks we add entries for PROC_TGID_OPENMOSIX,
                                                                PROC_TGID_OPENMOSIX_WHERE, PROC_TGID_OPENMOSIX_STAY,
                                                                PROC_TGID_OPENMOSIX_DEBUG, PROC_TID_OPENMOSIX,
                                                                PROC_TID_OPENMOSIX_WHERE, PROC_TID_OPENMOSIX_STAY, and
                                                                PROC_TID_OPENMOSIX_DEBUG into enum pid_directory_inos. this enum sets
                                                                the inode number for each of our "files" in /proc to unique values.
                                                                next we create the "om" entry with inode number PROC_TGID_OPENMOSIX
                                                                in the tgid_base_stuff structure. we then do the same thing again,
                                                                creating an "om" entry with inode number PROC_TID_OPENMOSIX in
                                                                tid_base_stuff. next we create a pair of structures
                                                                (tgid_openmosix_stuff and tid_openmosix_stuff), containing entries for
                                                                PROC_TGID_OPENMOSIX_WHERE/"where", PROC_TID_OPENMOSIX_WHERE/"where",
                                                                PROC_TGID_OPENMOSIX_STAY/"stay", PROC_TID_OPENMOSIX_STAY/"stay",
                                                                PROC_TGID_OPENMOSIX_DEBUG/"debug", and
                                                                PROC_TID_OPENMOSIX_DEBUG/"debug". these entries declare the contents of
                                                                our /proc/$PID/om/ directory. in our next hunk, we add
                                                                proc_pid_openmosix_read and proc_pid_openmosix_write functions, and a
                                                                file_operations structure named proc_pid_openmosix_operations
                                                                mapping .read and .write functions to the proc_pid_openmosix_read and
                                                                proc_pid_openmosix_write we just declared. in proc_pid_openmosix_read,
                                                                we first truncate a read request to PAGE_SIZE, and request a free page
                                                                with GFP_KERNEL. if __get_free_page fails, we return -ENOMEM. otherwise,
                                                                we then call openmosix_proc_pid_getattr from hpc/proc.c, which does the
                                                                work of dispatching our read request to the right function. it fills in
                                                                the page we allocated, and returns the number of characters written to
                                                                the page. if the length is less than zero, this indicates an error. we
                                                                respond to this error by freeing our page, and returning the error
                                                                value. assuming no error occurred, we check to see if the user requested
                                                                data beyond the end of what we 'read'. if they did, we free our
                                                                requested page, and return 0. otherwise, we take the seek value (ppos),
                                                                and apply it to our page. we then copy_to_user the contents of our page
                                                                (past the seek) to the passed in userspace buf, free our page, and
                                                                return the number of bytes copy_to_user'd. in proc_pid_openmosix_write,
                        reverse the order of these. fail first! we first truncate our write request to PAGE_SIZE, then check to see if
                                                                the user requested a 'partial write'. if they did, we return -EINVAL.
                                                                we then get a free page with GFP_USER. if that fails, we return -ENOMEM.
                                                                otherwise, we copy the data in the passed in buf into our new page. if
                                                                our copy_from_user fails, we free our page, and return -EFAULT.
                                                                otherwise, we call openmosix_proc_pid_setattr with our page, free said
                                                                page, and return the length returned (even if it's a negative value,
                                                                e.g. an error code). we then define a file_operations structure
                                                                (proc_pid_openmosix_operations), pointing its .read and .write members
                                                                to the two functions we just declared.
                                                                we then forward declare proc_tid_openmosix_operations,
                                                                proc_tid_openmosix_inode_operations, proc_tgid_openmosix_operations,
                                                                and proc_tgid_openmosix_inode_operations structures, which we'll define
                                                                later in this patch. next, we add a pair of cases to the large switch in
                                                                proc_pident_lookup that map all eight of our unique identifiers defined
                                                                at the beginning of this file to their respective file_operations
                                                                structures, and inode operations structures in the case of the
                                                                containing directories. this allows the proc system to find the
                                                                structures containing the function pointers to handle our requests.
                                                                in our last hunk, we provide implementations for the functions
                                                                proc_tgid_openmosix_readdir and proc_tid_openmosix_readdir
                                                                that return the result of calling proc_pident_readdir against our
                                                                tgid_openmosix_stuff and tid_openmosix_stuff structures.
                                                                we then define the proc_tgid_openmosix_operations and
                                                                proc_tid_openmosix_operations structures, mapping .read to
                                                                generic_read_dir and .readdir to our proc_tgid_openmosix_readdir or
                                                                proc_tid_openmosix_readdir declared above. we then define the functions
                                                                proc_tgid_openmosix_lookup and proc_tid_openmosix_lookup, which return
                                                                the result of calling proc_pident_lookup with our tgid_openmosix_stuff
                                                                or tid_openmosix_stuff structures. finally, we define our
                                                                proc_tgid_openmosix_inode_operations and
                                                                proc_tid_openmosix_inode_operations structures, mapping .lookup to
                                                                proc_tgid_openmosix_lookup or proc_tid_openmosix_lookup.
024     local,procfs    fs/proc/root.c                          this patch changes proc_root_init to call openmosix_proc_init. first,
                        remove #ifdef around include            we include our hpc/hpc.h header. then, we add our call to
                                                                openmosix_proc_init (from hpc/proc.c) into proc_root_init.
025     ksocket,local,  fs/select.c                             this patch exports the do_select function, so that hpc/kcomd.c can use
        remote                                                  it to check for data on our pile of incoming sockets. this function is
                                                                only exported if CONFIG_KCOMD is selected.
026     i386, local     creates hpc/arch-i386.c                 this file contains functions converting between two different but        http://arch.ece.uic.edu/~yxshi/param/web/homepage/research/doc/reference/vc130.htm
        remote, syscall this file should be broken up.          compatible x87 floating point state formats, for sending and receiving
        archmig                                                 architecture specific sections of a given task's state, a function for
                                                                starting a new guest process, and support functions for handling
                                                                syscall requests from entry.S. first, we forward declare the functions
                                                                twd_fxsr_to_i387 and twd_i387_to_fxsr from arch/i386/kernel/i387.c.
                                                                we utilize them in creating fxsave_to_fsave and fsave_to_fxsave
                        make the order of operations in these   functions. fxsave_to_fsave is called by the later declared
                        two functions identical.                arch_mig_receive_fp to convert from fxsave to fsave format. we start
                                                                by copying the contents of the cwd, swd, fip, fcs, foo, and fos fields
                                                                of the from union to the to union. we then use twd_fxsr_to_i387() to
                                                                fill in our twd member. we then save padding[0] to our fop member, and
                                                                padding[1] to our mxcsr member. next we perform a memcopy loop to copy
                                                                and convert the st_space member. this member contains fields that are
                                                                16 bytes long in fxsave format, and 10 bytes long in fsave format. we
                                                                loop through the fields, copying only the first 10 bytes to our to's
                                                                st_space. finally, we memcopy the xmm_space member. fsave_to_fxsave is
                                                                also called by arch_mig_receive_fp, to convert from fsave to fxsave
                                                                format. we start by copying the contents of the cwd, swd, fip, fcs,
                                                                foo, and fos to the 'to' union, from the 'from' union. we use
                                                                twd_i387_to_fxsr() to fill in our twd member, save padding[0] to our
                                                                fop member, and save padding[1] to our mxcsr member. after that, we
                                                                enter a loop, memcopying our 10 byte long members of st_space to 16
                                                                byte spaces. finally, we memcopy the xmm_space member. the next three
                                                                functions are for receiving architecture specific state information.
                        BROKEN! does not handle setting up      arch_mig_receive_specific is called by mig_do_receive in
                        LDT entries!                            hpc/migrecv.c. its purpose is to receive the architecture specific part
                                                                of a process. the one in this file has code to warn us that we're not
                                                                setting up the LDT correctly, and still returns success. if it's asked
                                                                to set up anything else, we return -1. arch_mig_receive_proc_context is
                                                                called at the top of mig_do_receive_proc_context, from hpc/migrecv.c.
                                                                its function is to set up the CPU state from the passed in omp_mig_task
                                                                structure. we start by getting the pt_regs structure of the task we're
                        check failure in this function!         setting up with ARCH_TASK_GET_USER_REGS. we then overwrite it with
                                                                omp_mig_task's regs member. we overwrite our task's thread.debugreg
                                                                with arch.debugreg from omp_mig_task, as well as overwriting thread.fs
                                                                and thread.gs with arch.fs and arch.gs (setting up our segmentation
                                                                registers). we then copy the contents of the tls_array structure, which http://lwn.net/Articles/5851/
                                                                contains the 'thread local storage' segment offsets. this function always
                                                                returns 0. arch_mig_receive_fp is called by mig_do_receive_fp, from
                                                                hpc/migrecv.c. its function is to set up the FPU state from the passed
                                                                in omp_mig_fp structure. we start by calling unlazy_fpu, to initialize
                                                                the FPU, then we check whether the current CPU has the fxsr feature,
                                                                and whether the remote CPU has the fxsr feature. if they both do,
                                                                or if they both don't, that means the floating point save is in the
                                                                same format, so we just memcpy the state from the omp_mig_fp struct to
                                                                the task's thread.i387 structure. otherwise, we call one of the above
                                                                two conversion functions (fxsave_to_fsave, or fsave_to_fxsave) to
                                                                perform the copy, while translating the formats. the next two functions
                                                                are called by mig_do_send in hpc/migsend.c, before and after doing the
                                                                actual work of sending a task to another node (home or remote).
                                                                arch_mig_send_pre clears the LDT if there is one set for this process,
                                                                and arch_mig_send_post loads the LDT back up, if there is one.
                                                                the next three functions are the send side, to match the three
                                                                arch_mig_receive functions earlier. all three of these functions are
                                                                called from mig_do_send, in hpc/migsend.c. arch_mig_send_specific is a
                        STUB!                                   stub that looks like it was supposed to send the LDT, but instead
                                                                prints a warning if an LDT is being used. arch_mig_send_fp is called
                                                                to send the FPU state. we call unlazy_fpu, then fill in the fp
                                                                state (along with the fxsr flag). arch_mig_send_proc_context is called
                                                                to send the CPU context of a task. in it, we store the user registers,
                                                                segmentation registers (FS and GS), the thread local storage entries, and
                                                                the debugreg registers to the passed in struct omp_mig_task. in
                                                                addition, if this task is marked DDEPUTY (meaning we're on the home
                                                                node), we also send the features of the boot CPU. arch_kickstart is the
                                                                function called to start up a newly "created" task. in it, we set up
                                                                debug registers 0-3, 6, and 7 with set_debugreg(). we intentionally
                                                                omit registers 4 and 5 due to them being just aliases for 6 and 7. we
                                                                use load_TLS to set up the thread local storage, and use loadsegment to
                        do we need to flush pending signals?    load our FS and GS registers. we set CS to __USER_CS, flush pending
                                                                signals, and execute an assembly fragment that causes us to immediately
                                                                jump to the ret_from_kickstart entry point in entry.S. at this point
                        split this back off.                    there's a break in the file, like this section used to be another file.
                                                                we include some headers, then define three functions that are part of
                                                                our syscall handling subsystem. arch_exec_syscall is called by
                                                                deputy_do_syscall, to call a requested syscall on behalf of a remote
                                                                process. we use OMDEBUG_SYS to print a tracing message, look up the
                                                                requested syscall in the sys_call_table, and return the result of
                                                                calling it (through a function pointer) with the passed in arguments.
                        these functions belong in the same      the next two functions are called via the remote_sys_call_table in
                        place as user_thread!                   arch/i386/kernel/omasm.h, by guest processes. om_sys_fork is called by
                                                                a guest process, trying to fork. we just wrap remote_do_fork, passing
                                                                it a clone_flag of SIGCHLD, and null arguments for parent and child
                                                                thread pointers. om_sys_clone performs similarly, first checking for a
                                                                new stack pointer in CX. if there isn't one, we re-use the current
                                                                task's stack pointer. we accept the clone_flags in register ebx, the
                                                                parent tidptr in edx, and the child tidptr in edi. we pass all of this
                                                                to the same remote_do_fork as the previous function.
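
note on hunk 026: the st_space conversion described above (16 byte slots in
fxsave format, 10 byte slots in fsave format) can be sketched in plain C as
below. this is a userspace illustration only, not the code from
hpc/arch-i386.c. the names st_pack and st_unpack, the three constants, and the
zero filling of the 6 unused bytes per slot are assumptions made for the
example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ST_REGS   8   /* the x87 stack has eight st registers */
#define FSAVE_SZ  10  /* bytes per register in fsave format */
#define FXSAVE_SZ 16  /* bytes per register slot in fxsave format */

/* hypothetical helper: fxsave layout -> fsave layout.
 * copies only the first 10 bytes of each 16 byte slot. */
void st_pack(uint8_t *to, const uint8_t *from)
{
    assert(to && from);
    for (int i = 0; i < ST_REGS; i++)
        memcpy(to + i * FSAVE_SZ, from + i * FXSAVE_SZ, FSAVE_SZ);
}

/* hypothetical helper: fsave layout -> fxsave layout.
 * each 10 byte value lands at the start of a 16 byte slot;
 * the remaining 6 bytes are zeroed (an assumption of this sketch). */
void st_unpack(uint8_t *to, const uint8_t *from)
{
    assert(to && from);
    memset(to, 0, ST_REGS * FXSAVE_SZ);
    for (int i = 0; i < ST_REGS; i++)
        memcpy(to + i * FXSAVE_SZ, from + i * FSAVE_SZ, FSAVE_SZ);
}
```

packing and then unpacking restores the 10 significant bytes of every slot,
which is what lets two nodes with different save formats exchange FPU state.
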
027     ppc, local,     creates hpc/arch-ppc.c                  this patch is very similar to the previous patch, but cleaner.
        remote,                                                 arch_mig_receive_specific just returns 0. it's called by mig_do_receive
        arch_mig,                                               from hpc/migrecv.c. its purpose is to receive the architecture specific
        syscall                                                 part of a process, which apparently the PPC doesn't have.
                                                                arch_mig_receive_proc_context is called at the top of
                                                                mig_do_receive_proc_context, from hpc/migrecv.c. its purpose is to set
                                                                the user registers of the current task to the contents of the passed in
                                                                omp_mig_task structure. in it, we simply use ARCH_TASK_GET_USER_REGS to
                        check return of memcpy!                 retrieve the registers in question, then memcpy over them from our
                                                                passed in structure. we always return 0. arch_mig_receive_fp is called
                                                                by mig_do_receive_fp, from hpc/migrecv.c. its function is to set up the
                                                                current task's FPU state to the one passed in the omp_mig_fp structure.
                                                                in it, we memcopy the floating point registers from the passed in
                        FIXME: fpscr_pad not needed?            structure over the task->thread->fpr, and copy the fpscr and fpscr_pad
                                                                as well. arch_mig_send_pre and arch_mig_send_post are void no-ops.
                                                                their purpose is to make a process "ready to be migrated" while we're
                                                                pulling the process apart, which apparently isn't needed on PPC.
                                                                they're called at the beginning and end of mig_do_send in hpc/migsend.c,
                                                                respectively. arch_mig_send_specific is also a no-op, as the PPC has no
                                                                architecture specific "parts" of a process. in it, we just return 0.
                                                                arch_mig_send_fp is called by mig_do_send to fill the passed in
                                                                omp_mig_fp structure with the floating point state of the current
                        check this memcpy!                      task. in it, we memcopy the task->thread->fpr structure into the
                        FIXME: fpscr_pad not needed?            omp_mig_fp, and set the fpscr and fpscr_pad members as well. we always
                                                                return 0. arch_mig_send_proc_context is called by mig_do_send to fill
                                                                in the passed in omp_mig_task structure with the CPU state of the
                                                                current task. in it, we use ARCH_TASK_GET_USER_REGS to get the pt_regs
                        check this memcpy!                      structure, and just memcpy it into our omp_mig_task structure. we
                                                                return 0. arch_kickstart is called by mig_handle_migration to start a
                                                                guest process for the first time. to accomplish this, we get the user
                        what are we doing with mr 1, or the     registers, and branch to ret_from_kickstart, passing our user registers
                        user registers?                         as input. arch_exec_syscall is called by deputy_do_syscall to call a
                                                                requested syscall on behalf of a remote process, returning its result.
                                                                we look up the requested syscall in the sys_call_table, and return the
                                                                result of calling it.
028     x86_64          creates hpc/arch-x86_64.c               similar to the previous file.                                           <asm/uaccess.h>
                                                                arch_mig_receive_specific is a stub, returning 0.                       <linux/kernel.h>
                                                                arch_mig_receive_proc_context copies from our omp_mig_task structure to <linux/kallsyms.h>
                        none of these functions return failure. our task structure the user registers, ds, es, fs, gs, fsindex,         <linux/sched.h>
                        why do they not return void?            gsindex, then uses write_pda to set gs to point to the per-processor    <hpc/debug.h>
                                                                datastructure. arch_mig_receive_fp calls unlazy_fpu, then memcopies the <asm/ptrace.h>
                                                                thread.i387 datastructure. arch_mig_send_pre clears the LDT,            <asm/desc.h>
                                                                arch_mig_send_post loads the LDT (if we have one in our context).       <asm/i387.h>
                                                                arch_mig_send_specific is a stub, returning 0.                          <hpc/protocol.h>
                                                                arch_mig_send_proc_context copies from our task structure to our        <hpc/arch.h>
                                                                omp_mig_task structure the user registers, ds, es, fs, gs, fsindex,     <hpc/task.h>
                                                                gsindex, then uses read_pda to get the pointer to our per-processor     <hpc/syscalls.h>
                                                                datastructure out of the gs register. arch_kickstart sets debugging     <hpc/prototype.h>
                                                                registers 0-3,6,7, loads up the segmentation registers, flushes
                                                                pending signals, and jmps to ret_from_kickstart. arch_exec_syscall just
                                                                calls a given syscall returning the results the syscall returned.
                                                                asmlinkage om_sys_fork calls remote_do_fork. om_sys_iopl, om_sys_vfork,
                                                                om_sys_clone, om_sys_rt_sigsuspend, and om_sys_signalstack are declared
                                                                as unimplemented functions, printing an error and returning -1 when
                                                                called.
029     kcom            creates hpc/comm.c                      this is the kernel-to-kernel communication system. we use tcp/ip        <linux/sched.h>
                                                                sockets to pass information back and forth between kernels.             <linux/socket.h>
                        sock and sk need sync'd                 first, we define three timeout variables (conn_remote_timeo,            <linux/in.h>
                                                                comm_connect_timeo, and comm_reconn_timeo), which are initialized from  <linux/in6.h>
                        bad comments                            values #defined elsewhere. comm_shutdown is a wrapper to safely call    <linux/net.h>
                                                                sock->ops->shutdown. comm_getname is a wrapper to safely call           <hpc/mig.h>
                                                                sock->ops->getname. it returns -1 if something's null that shouldn't be <hpc/debug.h>
                                                                or if getname returns null. comm_data_ready is a wrapper which calls    <hpc/comm.h>
                                                                wake_up_interruptible to wake up task(s) in the socket's sleeping task  <hpc/task.h>
                        who else needs to do this?              queue. comm_setup_tcp first saves our current address space limit,      http://mail.nl.linux.org/kernelnewbies/2001-11/msg00204.html
                                                                turns on kernel address space, uses sock_setsockopt
                        should this be comm_wrappers.c?         to set SO_KEEPALIVE, then uses sock->ops->setsockopt to set
                                                                TCP_KEEPINTVL, TCP_KEEPCNT, TCP_KEEPIDLE, and TCP_NODELAY. it restores
                                                                our original address space limit, and exits. comm_socket is a wrapper
                                                                around sock_create, returning NULL on error. comm_bind is a wrapper
                        missing checks!                         around sock->ops->bind that logs an error via printk. comm_listen is a
                        missing checks!                         wrapper around sock->ops->listen. comm_connect connects to a remote
                        missing checks!                         kernel via the passed socket, to the passed address. it adds the
                                                                current process to the socket's sleeping task queue, and asynchronously
                                                                asks for the connection to be established. we enter a loop, marking
                                                                the current process TASK_INTERRUPTIBLE, requesting connection
                                                                establishment asynchronously, then using schedule_timeout to sleep.
                                                                when the connection succeeds, we leave the loop, mark the current
                                                                process TASK_RUNNING, and return 0. comm_close is a wrapper around
                                                                sock_release. comm_peek returns whether a socket has data pending.
                        sighfile needs more docs.               comm_poll waits on an "event" to occur on a socket via poll(), until
                                                                the passed timeout period, or MAX_SCHEDULE_TIMEOUT. it uses a similar
                                                                method as the earlier comm_connect, only we use poll() to see if there
                                                                is any data waiting for us on the socket. if there is, return 1,
                        comm_wait should be a define?           otherwise we return 0 when we hit our timeout period. comm_wait is a
                                                                wrapper around comm_poll, filling in some default parameters. comm_accept
                                                                receives a passed socket that's been connected to, creates a new socket,
                                                                and uses it to accept a connection from a remote kernel. once comm_poll
                                                                indicates there's data on the passed socket, we use comm_setup_tcp
                                                                to set the connection options on the new socket. if that succeeds, we
                                                                return 0. otherwise, we destroy our newly created socket, and return
                                                                the relevant error. comm_dorecv wraps the sock_recvmsg api to
                        s/lenght/length                         read a given amount of data from a socket. comm_recv wraps
                                                                comm_dorecv, but also uses the address space change trick of earlier to
                                                                jump into KERNEL_DS, and in case of short read we OMBUG(), then call
                                                                comm_shutdown(link), returning the error from comm_dorecv. comm_send
                        when should we printk,                  uses the address space change, then wraps sock_sendmsg(). in case of
                        when should we OMBUG()?                 short send, it printks and just returns the error. next is a
                                                                "openmosix specifics start here" marker in the comments.                hpc/protocol.h
                                                                set_our_addr sets up the passed sockaddr structure with its default
                                                                family, INADDR_ANY, and the passed port. comm_setup_listen uses
                                                                comm_socket, comm_bind, and comm_listen to set up a listening socket.
                                                                comm_setup_connect opens a connection to a target
                                                                kernel using comm_socket, then comm_connect. comm_send_hd sends a data
                                                                segment, with an omp_req header, then the data itself.
                                                                finally, comm_send_req sends an omp_req structure, containing only the
                                                                type, no data.
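
note on hunk 029: the difference between comm_send_hd (header, then data) and
comm_send_req (header only) can be sketched as message framing into a byte
buffer. this is a userspace sketch, not the code from hpc/comm.c. struct
req_hdr, its dlen field, and the names frame_hd and frame_req are hypothetical;
the commentary above only states that an omp_req carries a type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical stand-in for omp_req; only 'type' is stated by the source */
struct req_hdr {
    int type;
    int dlen;   /* assumed field: length of the data that follows */
};

/* write a header followed by len bytes of data into buf.
 * returns the number of bytes written, or 0 if buf is too small. */
size_t frame_hd(unsigned char *buf, size_t cap, int type,
                const void *data, int len)
{
    struct req_hdr h = { type, len };

    assert(buf && len >= 0 && (data || len == 0));
    if (cap < sizeof(h) + (size_t)len)
        return 0;
    memcpy(buf, &h, sizeof(h));
    if (len > 0)
        memcpy(buf + sizeof(h), data, (size_t)len);
    return sizeof(h) + (size_t)len;
}

/* write a header only, carrying just the type and no data */
size_t frame_req(unsigned char *buf, size_t cap, int type)
{
    return frame_hd(buf, cap, type, NULL, 0);
}
```

the receiving side would read a fixed size header first, then use the length
in it to decide how many more bytes to read.
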
030     rmem            creates hpc/copyuser.c                  this file contains routines for moving chunks of memory over an         <linux/sched.h>
                                                                established connection. it's broken into two parts, deputy_* functions, <hpc/protocol.h>
                                                                and remote_* functions. deputy_ functions are run on the home node, and <hpc/debug.h>
                                                                remote_ functions run on the node a process has been migrated to.       <hpc/prototype.h>
                                                                deputy_copy_from_user requests a given memory segment from the remote   <hpc/hpc.h>
                                                                node. it uses comm_send_hd to send the address to read and the size to
                        OMDEBUG_CPYUSER() is being used in the  read to the remote host. it then uses comm_recv to recv the results
                        deputy code to printk with a unique     directly to the passed destination. its symbol is exported via
                        format?                                 EXPORT_SYMBOL(). deputy_strncpy_from_user requests a given memory
                                                                segment from the remote node, and should be merged with the previous
                                                                function. it uses comm_send_hd to send the address to read and the size
                                                                to read to the remote host. it then uses comm_recv to recv the results
                                                                directly to the passed destination. its symbol is not exported.
                                                                deputy_copy_to_user functions similarly, using comm_send_hd to send the
                                                                address to write and the size, then comm_send to send the data to be
                                                                written to the remote node. its symbol is EXPORT_SYMBOL'd.
                                                                deputy_strnlen_user sends the address and length via comm_send_hd, then
                                                                uses comm_recv to get the result from the remote node. its symbol is
                                                                EXPORT_SYMBOL'd. deputy_put_userX writes a value of 64 bits or less
                                                                using a single call to comm_send_hd. its symbol is not exported.
                                                                deputy_put_user puts a long to remote by calling deputy_put_userX.
                                                                its symbol is EXPORT_SYMBOL'd. if BITS_PER_LONG < 64, we create a
                                                                deputy_put_user64 that uses deputy_put_userX to put an up to 64 bit
                                                                value to remote, and EXPORT_SYMBOL it. deputy_get_userX gets a 64 bit
                                                                or less value from remote using comm_send_hd, then comm_recv. its
                                                                symbol is not exported. deputy_get_user wraps deputy_get_userX,
                                                                warning us if it's asked for something greater than sizeof(long). its
                                                                symbol is EXPORT_SYMBOL'd. if BITS_PER_LONG < 64, we create a
                                                                deputy_get_user64 that uses deputy_get_userX to get a 64 bit value
                                                                from the remote node. its symbol is EXPORT_SYMBOL'd. at this point, we
                                                                start into code running on the remote node, responding to the above
                                                                sections of code. remote_copy_user handles requests from
                                                                deputy_copy_to_user and deputy_copy_from_user. its symbol is not
                                                                exported. remote_strncpy_from_user performs strncpy_from_user on
                                                                behalf of deputy_strncpy_from_user. it uses comm_recv to get its
                                                                target, and comm_send to return the results. its symbol is not
                                                                exported. remote_strnlen_from_user performs strnlen_user or strlen_user
                                                                on behalf of deputy_strnlen_user. it works similarly to
                                                                remote_strncpy_from_user. its symbol is not exported. remote_put_user
                                                                will use put_user on behalf of the home node in up to a 64 bit size. it's
                                                                missing BITS_PER_LONG logic that should be like the following function.
                                                                its symbol is not exported. remote_get_user is structured similarly.
                                                                it has BITS_PER_LONG==64 logic. its symbol is not exported. finally,
                                                                we have remote_handle_user, which is the function that dispatches to the
                                                                above remote_ functions. it calls comm_recv looking for a req structure.
                                                                other than that, it's a large switch statement. we return from it when we
                                                                receive an endtype packet, returning 0. if there's an unrecognised
                                                                packet, we call remote_disappear to die.
031     omctrlfs        creates hpc/ctrlfs.c                    omctrlfs is the future filesystem for performing migration and          http://osdir.com/ml/linux.cluster.openmosix.devel/2006-01/msg00028.html
                                                                remote process state monitoring. this file is a stub of support for     <linux/config.h>
                                                                this filesystem type. CTRLFS_MAGIC is the magic string at the beginning <linux/module.h>
                                                                of the FS for the filesystem layer to recognise this FS type.           <linux/fs.h>
                                                                ctrlfs_fill_super wraps simple_fill_super(), passing it our             <linux/mount.h>
                                                                CTRLFS_MAGIC, and our empty list of files.  ctrlfs_get_sb wraps
                                                                get_sb_single(), telling it to use ctrlfs_fill_super to generate our
                                                                filesystem's superblock. we then have a file_system_type structure,
                                                                mapping .get_sb to our ctrlfs_get_sb, and .kill_sb to a generic cleanup
                                                                function. om_ctrlfs_init is called from the kernel to init the
                                                                module. it calls register_filesystem() with the previously defined
                                                                file_system_type structure. om_ctrlfs_exit is called prior to
                                                                removing the module. it calls simple_release_fs(), then
                                                                unregister_filesystem(). we then define the init and exit points for
                                                                the module, register the license and the author.
032     debug           creates hpc/debug.c                     this file contains debugging assisting code. it starts with debug_mlink <asm/uaccess.h>
                                                                which is a wrapper which printks the address of a socket.               <linux/kallsyms.h>
                                                                debug_page creates a checksum of a 4096 byte page of memory, and        <linux/sched.h>
                        check incoming pointers!                printks the results. debug_vmas dumps the starting address and ending   <linux/config.h>
                                                                address of each vma belonging to a given mm_struct. debug_signals is a  <hpc/debug.h> <hpc/protocol.h> <hpc/comm.h>
                                                                stub, not printking anything of value.
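
note: a minimal userspace sketch of the page checksum idea described for debug_page above. the actual checksum
algorithm is not given in this commentary, so the plain byte sum, the names sketch_page_checksum,
sketch_debug_page and SKETCH_PAGE_SIZE, and the NULL check (prompted by the "check incoming pointers!" note)
are all assumptions, not the patch's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096   /* page size named in the commentary */

/*
 * hypothetical stand-in for debug_page's checksum: a plain 32-bit sum of
 * every byte in the page. returns 0 and stores the sum in *sum, or returns
 * -1 without touching *sum if either pointer is NULL.
 */
int sketch_page_checksum(const unsigned char *page, uint32_t *sum)
{
    uint32_t acc = 0;
    size_t i;

    if (page == NULL || sum == NULL)
        return -1;

    for (i = 0; i < SKETCH_PAGE_SIZE; i++)
        acc += page[i];

    *sum = acc;
    return 0;
}

/* reports the checksum the way a printk based helper might */
void sketch_debug_page(const unsigned char *page)
{
    uint32_t sum;

    if (sketch_page_checksum(page, &sum) == 0)
        printf("page %p checksum: %lu\n", (const void *)page, (unsigned long)sum);
    else
        printf("page checksum: bad pointer\n");
}
```

checking the pointer before touching the page is the point of the sketch; a byte sum is only one of many possible checksums.
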
033     debugfs         creates hpc/debugfs.c                   this file contains the debugfs module. it starts with a dentry          <hpc/hpc.h>
                                                                structure for the om/ debugfs directory itself, then we define four
                                                                file entries, pointing the migration, syscall, rinode, and copyuser
                        move om_opts here?                      files to entries in the om_opts structure (defined in hpc/kernel.c),
                        we don't seem to be using these         and an array of dentry structures for the four files. om_debugfs_init
                        debug values anywhere else, what is     is called to initialize the module. it calls debugfs_create_dir to
                        the use of this code?                   create the om debugfs directory, then debugfs_create_u8 to create
                                                                entries to our four files in the directory. om_debugfs_exit is called
                                                                prior to removing this module. it uses debugfs_remove to destroy the
                                                                entries for the four files, then the directory itself. we then have
                                                                code defining the entry and exit points of the module, the license,
                                                                and the author.
034     i386,arch-debug creates hpc/debug-i386.c                architecture specific debugging code, i386 version. om_debug_regs dumps <asm/uaccess.h> <linux/kallsyms.h> <linux/sched.h>
                                                                the user register set, otherwise known as the pt_regs structure, that   <hpc/debug.h> <asm/ptrace.h> <asm/desc.h>
                        remove one uaccess.h include.           is passed in. if no pt_regs structure is passed in,                     <asm/i387.h> <asm/uaccess.h> <asm/ptrace.h>
                        remove one ptrace.h include.            we use ARCH_TASK_GET_USER_REGS to retrieve the structure from the       <hpc/protocol.h> <hpc/arch.h> <hpc/task.h>
                                                                current process. debug_thread dumps thread related registers.
                                                                show_user_registers is shamelessly stolen according to the comments,
                                                                and does a much better job of dumping the full state of a user process
                                                                than om_debug_regs, including code pointer, stack pointer.. lots of
                                                                debugging.
035     ppc,arch-debug  creates hpc/debug-ppc.c                 architecture specific debugging code, ppc version. om_debug_regs dumps  <asm/uaccess.h> <linux/kallsyms.h> <linux/sched.h>
                                                                the pt_regs structure passed in, or if NULL is passed in, the pt_regs   <hpc/debug.h> <asm/ptrace.h> <asm/uaccess.h>
                        remove one uaccess.h, one ptrace.h      structure of the current process. debug_thread and show_user_registers  <asm/ptrace.h> <asm/processor.h> <hpc/protocol.h>
                                                                are stubs, doing nothing and returning nothing.                         <hpc/arch.h>
036     x86_64,         creates hpc/debug-x86_64.c              architecture specific debugging code, x86_64 version. om_debug_regs     <asm/uaccess.h> <linux/kallsyms.h> <linux/sched.h>
        arch-debug                                              dumps the pt_regs structure passed in, or if NULL, of the current       <hpc/debug.h> <asm/ptrace.h> <asm/desc.h>
                                                                process. debug_thread and show_user_registers are stubs, doing nothing  <asm/i387.h> <asm/uaccess.h> <asm/ptrace.h>
                                                                and returning nothing.                                                  <hpc/protocol.h> <hpc/arch.h> <hpc/task.h>

037     omremote        creates hpc/deputy.c                    deputy.c contains functions for servicing requests from a remote        <linux/sched.h>
                                                                process, AKA, communication to the home node, from a process that is a  <linux/signal.h>
                                                                guest on a remote node. first, there's deputy_die_on_communication,     <linux/file.h>
                        rename deputy_die_on_communication to   which in spite of its name is called by deputy_process_communication to <linux/mount.h>
                        deputy_die                              kill the deputy when communication with the remote node containing the  <linux/acct.h>
                                                                remote half of the process fails. it printk's a message, then calls     <asm/mmu_context.h>
                                                                do_exit(SIGKILL). deputy_do_syscall receives a syscall request from the <hpc/comm.h>
                                                                remote process and executes it, returning the result. deputy_do_fork    <hpc/task.h>
                                                                processes a fork on behalf of the remote process. it opens up a new     <hpc/mig.h>
                                                                connection to the remote node, calls do_fork, then uses task_set_comm   <hpc/arch.h>
                                                                to make the child process communicate over the newly created            <hpc/syscalls.h>
                                                                connection. deputy_do_readpage uses task_heldfiles_find to find a       <hpc/debug.h>
                                                                given file owned by the current deputy process, maps a single page into <hpc/prototype.h>
                                                                memory, sends the contents to the remote node, then unmaps the page.
                                                                deputy_do_mmap_pgoff is called by do_mmap_pgoff in mm/mmap.c to perform
                                                                the same function as do_mmap_pgoff's lower half, with differences for
                                                                deputy processes. to accomplish this, we allocate memory for a vma
                                                                structure from SLAB_KERNEL and zero it. we set up a vma structure in
                                                                this memory corresponding to the memory area we've been requested to
                                                                occupy, and pass it to our passed file *'s mmap f_op handler. we then
                                                                add this file to our held files for this process by calling
                                                                task_heldfiles_add. there's a comment here indicating that we're
                        fill in missing code!                   supposed to insert the vma into our current->??, but the code for that
                                                                isn't yet written. deputy_do_mmap is called from
                                                                deputy_process_communication below. it uses do_mmap_pgoff in mm/mmap.c
                                                                to mmap a file into the deputy, and returns the mmapped region's
                                                                contents to the remote host. bprm_drop is used by the later declared
                                                                __deputy_do_execve to destroy a linux_binprm structure, which is an
                                                                executable program and arguments, destroying its pages, its security    http://www.kernel-api.org/docs/online/1.0/da/d1e/structlinux__binprm.html
                                                                context, mm structure, and calling fput() on all its writable files.
                                                                __deputy_do_execve uses search_binary_handler to attempt execve on
                                                                the home node. if it was successful, we have a FIXME indicating we
                                                                should be freeing the pages containing our arguments.
                                                                we then free our bprm's security context, call acct_update_integrals
                                                                (to tell the accounting system about the new process), free the
                                                                bprm structure, and "return" to the new process. otherwise, we use the
                                                                above bprm_drop to clean up the failed execve attempt.
                                                                deputy_setup_bprm is used by the below deputy_do_execve to setup a bprm
                                                                structure suitable for execution by __deputy_do_execve. we allocate
                                                                space for our bprm structure from GFP_KERNEL. we use open_exec to
                                                                attempt to open our executable. if that succeeds, we fill in the bprm's
                                                                file, filename, interp, and mm members, using mm_alloc to fill in
                                                                mm. we use init_new_context to accomplish any architecture specific
                                                                requirements. on x86, this function copies the local descriptor table
                                                                of the current process to the new process, assuming it has been
                                                                customized. we copy argc and envc, making sure neither is less than
                                                                zero. we allocate a security context, then use prepare_binprm to fill
                                                                in the rest of the bprm structure. we use copy_strings_kernel to
                                                                safely copy our filename, argv, and envp arrays into kernel pages,
                                                                instead of user space memory. if any of the above fails, we use
                                                                bprm_drop to clean up. deputy_do_execve
                                                                processes an execve request from the remote process to execve a new
                                                                executable. it calls comm_recv to receive the requested file, argv, and
                                                                envp, deputy_setup_bprm to get a bprm structure ready to execute, then
                                                                __deputy_do_execve to perform the work. we then use comm_send_hd to
                                                                send back an empty reply. if any of the above fails, we call bprm_drop
                                                                to destroy our bprm structure. deputy_do_sigpending is a wrapper around
                                                                do_signal. it has code for doing more, but it's dead/unused code.
                                                                deputy_process_misc checks for pending dreqs, and dispatches them to
                                                                task_do_request. it then checks for pending signals, and dispatches
                                                                them to deputy_do_sigpending. it's called by deputy_main_loop between
                                                                communication events. deputy_process_communication contains the switch
                                                                case that calls the aforementioned functions. it calls
                                                                deputy_die_on_communication if comm_recv returns an error, if the type
                                                                member of the req received is zero, or if one of the functions we call
                                                                returns negative. deputy_main_loop is the userspace loop that is
                                                                executed on the home node when a process has gone remote. it calls
                                                                deputy_process_communication when comm_wait returns true. it then calls
                                                                deputy_process_misc to accomplish dispatching of events. deputy_startup
                                                                uses task_set_dflags to mark this task as deputy, flushes a signal that
                                                                pops up for an unknown reason, according to a fixme, and calls exit_mm,
                                                                which is a forward declare from kernel/exit.c.
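
note: a minimal userspace sketch of the dispatch rule described for deputy_process_communication above: die when
the receive fails, when the request type is zero, or when a handler returns negative. the request types, the
sketch_req layout, the handler bodies, and the choice to treat unknown types as fatal are assumptions made for
illustration; they are not taken from hpc/deputy.c.

```c
#include <stddef.h>

/* hypothetical request types; 0 is treated as invalid, as in the text */
enum sketch_req_type {
    SKETCH_REQ_INVALID = 0,
    SKETCH_REQ_SYSCALL,
    SKETCH_REQ_FORK,
    SKETCH_REQ_EXECVE
};

struct sketch_req {
    int type;
    int arg;    /* hypothetical payload: the handlers below just return it */
};

/* stand-ins for deputy_do_syscall, deputy_do_fork and deputy_do_execve */
static int sketch_do_syscall(const struct sketch_req *req) { return req->arg; }
static int sketch_do_fork(const struct sketch_req *req)    { return req->arg; }
static int sketch_do_execve(const struct sketch_req *req)  { return req->arg; }

/*
 * returns 0 when the request was handled, or -1 when the caller should
 * kill the deputy (the role deputy_die_on_communication plays).
 * recv_err stands in for the return value of comm_recv.
 */
int sketch_process_communication(int recv_err, const struct sketch_req *req)
{
    int ret;

    if (recv_err < 0 || req == NULL || req->type == SKETCH_REQ_INVALID)
        return -1;

    switch (req->type) {
    case SKETCH_REQ_SYSCALL:
        ret = sketch_do_syscall(req);
        break;
    case SKETCH_REQ_FORK:
        ret = sketch_do_fork(req);
        break;
    case SKETCH_REQ_EXECVE:
        ret = sketch_do_execve(req);
        break;
    default:
        /* assumption: an unrecognised type is fatal as well */
        return -1;
    }

    return ret < 0 ? -1 : 0;
}
```
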
038     omremotefile    creates hpc/files.c                     this file contains routines for handling file access on the home node   <linux/fs.h>
                                                                for processes that are running on a remote node. it starts by declaring <linux/list.h>
                        move remote_aops inside of              two structures. remote_aops is an address_space_operations structure,   <linux/sched.h>
                        rdentry_create_entry                    mapping .readpage to remote_readpage, and not touching any other        <linux/file.h>
                                                                mappings. the second structure is remote_file_operations, mapping .mmap <linux/pagemap.h>
                                                                to remote_file_mmap, and not touching any other mappings.               <linux/mm.h>
                                                                task_heldfiles_add is called by deputy_do_mmap_pgoff in hpc/deputy.c,   <hpc/comm.h>
                                                                to create and insert an om_held_file structure representing a file into <hpc/prototype.h>
                                                                our linked list of held files. it allocates the om_held_file struct     <hpc/debug.h>
                                                                from GFP_KERNEL, uses get_file to increment the file usage counter,     http://www.faqs.org/docs/kernel_2_4/lki-3.html
                        remove nb member?                       fills in the om_held_file's file and nb entries with our passed file
                                                                pointer, fills in rfile->nopage with nopage from the
                                                                vm_operations_struct passed in, and inserts our om_held_file struct
                                                                into task->om.rfiles with list_add. we then return 0, since get_file
                                                                and list_add can't return errors. task_heldfiles_clear is called by
                                                                openmosix_task_exit to destroy the linked list containing all the files
                                                                held by the process. for each file in the list, it calls fput to
                                                                decrement the file usage counter, then frees the om_held_file
                                                                structure. task_heldfiles_find searches the list of heldfiles for an
                                                                om_held_file whose file member matches the passed in file pointer. it
                                                                uses list_for_each_entry to iterate over items. if it finds a match,
                                                                we return the heldfile, otherwise, we printk() an error, and return
                                                                NULL. next we have a structure declaration that has been commented out
                                                                with a #if 0 block. it was to declare a backing_dev_info structure.
                                                                after that, there's a break in the file, indicating the rest of the file
                        why is rfiles in the task structure,    is different from the above. this section starts by defining the
                        and dentries are stored globally?       om_remote_dentry structure, then defining a spinlock, and a list_head
                                                                for containing remote dentries. rdentry_delete acquires the
                        remove dead code.                       remote_dentries spinlock, and removes the first entry in the list with
                                                                a dentry member that matches the passed in dentry. if it doesn't find a
                                                                matching entry, it calls BUG(), and returns -ENOENT. rdentry_iput frees
                                                                a passed inode's generic_ip member (which contains our rfile_inode_data
                                                                structure), then calls iput to both push an inode's contents to disk,
                                                                and decrement its usage counter. the struct remote_dentry_ops maps
                                                                its .d_delete and .d_iput entries to the previous two functions. the
                                                                previous two functions, and this structure are not used anywhere in
                                                                the code. we declare a super_operations structure, containing no
                                                                operations, then we use this structure to fill the .s_op member when
                                                                declaring a super_block structure, also filling the .s_inodes member
                        move remote_file_vfsmnt inside of       with a new LIST_HEAD. struct remote_file_vfsmnt is an "empty"
                        rdentry_create_file.                    vfsmount structure, containing five list heads, and a mount count. it is
                                                                declared to be its own parent. rdentry_add_entry creates an
                                                                om_remote_dentry structure to contain a passed in dentry. it
                                                                allocates the om_remote_dentry from GFP_KERNEL, sets the dentry member
                                                                to the passed dentry, acquires the remote_dentries spinlock, adds the
                                                                om_remote_dentry to the remote_dentries list, and releases the spinlock.
                                                                if the kmalloc fails, we return -ENOMEM, otherwise we return 0.
                                                                rdentry_create_dentry is called by rdentry_create_file to create a new
                                                                dentry corresponding to the passed in rfile_inode_data. along the way,
                                                                it also registers the dentry with rdentry_add_entry. first, we create
                                                                a new inode, backed by our dummy rfiles_dummy_block. we create a
                                                                duplicate of the passed in rfile_inode_data allocated from GFP_KERNEL,
                                                                and set inode->u.generic_ip (the inodes private data space) to point to
                                                                the new copy. the inode's file and address space operations are pointed
                                                                to our earlier stubs remote_file_operations, and remote_aops. we
                                                                allocate a dentry using d_alloc, set its inode to this new inode, set
                                                                its .name to be "/", and make it its own parent. we use
                                                                rdentry_add_entry to add this to our remote_dentries list, and return
                                                                the new dentry. the error handling in this function seems VERY broken.
                                                                if either of our alloc calls fails (kmalloc or d_alloc), we free our
                                                                passed in data(!), call iput on our allocated inode, and return NULL.
                                                                rfile_inode_get_data is a wrapper returning inode->u.generic_ip.
                                                                rfiles_inode_get_file is a wrapper returning
                                                                rfile_inode_get_data(inode)->file. rfiles_inode_compare is a wrapper
                                                                that memcmps the passed inode's private data space against a supplied
                                                                rfile, returning the result. rdentry_find finds a rdentry whose dentry's
                                                                inode matches the passed in inode. it grabs the remote_dentries
                                                                spinlock, and uses list_for_each_entry to cycle through all of the
                                                                rdentries, comparing to rdentry->dentry->d_inode. if it finds a match,
                                                                it breaks out, unlocks the spinlock and returns the dentry of the
                                                                rdentry structure that was a match. otherwise, it unlocks the spinlock
                        verify this works.                      and returns NULL, due to the last dentry being NULL.
                                                                rdentry_create_file creates a file pointer matching the supplied
                                                                rfile_inode_data. it uses get_empty_filp to create an empty file
                                                                pointer, then uses dget(rdentry_find(data)) to get a dentry pointing to
                                                                the passed rfile_inode_data. if dget fails, we call
                                                                rdentry_create_entry to create a dentry pointing to our passed
                                                                rfile_inode_data. if our rdentry_create_entry call fails, we call
                                                                put_filp to close our file pointer, and return NULL. otherwise, we use
                                                                the remote_file_operations and remote_file_vfsmnt structures to set the
                                                                file pointer's f_op and f_vfsmnt members, set f_dentry to our dentry,
                                                                and mark the file pointer FMODE_READ. we then return the file pointer.
                                                                task_rfiles_get is called by mig_do_receive_vma and remote_do_mmap to
                                                                search through the process's vma pages, and check to see if any of
                                                                them have a particular file associated with them. first, we construct a
                                                                rfile_inode_data containing our passed in origfile, node, and isize.
                                                                we then compare it against our list of rdentry files, using
                                                                rfiles_inode_compare. if rfiles_inode_compare returns true,
                                                                task_rfiles_get returns the file pointer associated to the inode in
                                                                question. if not, it calls rdentry_create_file to create a
                                                                new rdentry containing the passed in file, and returns the file
                                                                pointer returned from rdentry_create_file.
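
note: a minimal userspace sketch of the held files bookkeeping described for task_heldfiles_add,
task_heldfiles_find and task_heldfiles_clear above. a plain singly linked list and an integer refcount stand in
for the kernel's list_head, get_file and fput; every sketch_ name is hypothetical, and returning -1 when the
allocation fails is an assumption (the commentary only mentions returning 0).

```c
#include <stddef.h>
#include <stdlib.h>

/* stand-in for struct file: only the usage counter matters here */
struct sketch_file {
    int refcount;
};

/* stand-in for om_held_file: one node per file held by the task */
struct sketch_held_file {
    struct sketch_file *file;
    struct sketch_held_file *next;
};

/* take a reference on the file and push a new node onto the list */
int sketch_heldfiles_add(struct sketch_held_file **head, struct sketch_file *file)
{
    struct sketch_held_file *held = malloc(sizeof(*held));

    if (held == NULL)
        return -1;              /* assumption: report allocation failure */

    file->refcount++;           /* plays the role of get_file */
    held->file = file;
    held->next = *head;
    *head = held;
    return 0;
}

/* return the node whose file member matches, or NULL if there is none */
struct sketch_held_file *sketch_heldfiles_find(struct sketch_held_file *head,
                                               const struct sketch_file *file)
{
    for (; head != NULL; head = head->next)
        if (head->file == file)
            return head;
    return NULL;
}

/* drop one reference per node and free the whole list */
void sketch_heldfiles_clear(struct sketch_held_file **head)
{
    while (*head != NULL) {
        struct sketch_held_file *held = *head;

        *head = held->next;
        held->file->refcount--; /* plays the role of fput */
        free(held);
    }
}
```

the add and clear paths are kept symmetric, one reference taken per node and one dropped per node, which is the property the description of the real functions relies on.
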
039     kcomd           creates hpc/kcomd.c                     kernel-to-kernel socket communication code. this file is set up to      <linux/sched.h>
                                                                create a kcomd.ko kernel module. it starts with three socket_           <linux/socket.h>
                                                                functions. socket_listen creates a socket, calls sock_map_fd to         <linux/in.h>
                                                                associate an fd to the socket, binds to it using its sock->ops->bind(), <linux/in6.h>
                                                                starts listening using its sock->ops->listen(), sets the passed in      <linux/net.h>
                                                                pointer res to point to the newly created socket, and returns the file  <linux/syscalls.h>
                                                                descriptor to the now established stream. if sock_create fails, we      <net/sock.h>
                                                                return -1. if sock_map_fd fails, we release our sock, assign NULL to    <net/tcp.h>
                                                                the address passed via res, and return -1. if either our bind or listen
                                                                fails, we close our fd, release our sock, assign NULL to res, and
                                                                return -1. socket_listen_ipv4 and socket_listen_ipv6 are called by
                                                                kcomd_thread to set up the correct type of listening socket. both
                                                                these functions are wrappers of the above socket_listen function.
                                                                they set up their appropriate type of sockaddr structure, and call
                        move these structures to a private      socket_listen. struct kcom_pkt is designed to contain a packet destined
                        header.                                 to a remote kernel. struct kcom_node is a container for a socket, and
                                                                the information regarding the node it points to. kcom_task is the
                                                                structure that contains kcomd's knowledge about a migrated process. it
                                                                contains the pid of the process in question, a kcom_node structure
                                                                defining what node a process is on, a list of processes communicating
                                                                with this node(?), a list containing outgoing packets, and a space for
                                                                one incoming packet. we define a spinlock and a list_head for
                                                                containing kcom_nodes. we then define sockets_fds as a fd_set_bits
                                                                structure. this structure is a more scalable version of a fd_set, used
                                                                by do_select. we then declare sockets_fds_bitmap and maxfds, which are
                                                                set and used by the next function, alloc_fd_bitmap, to hold a
                                                                dynamically grown array of fds. alloc_fd_bitmap takes the passed in fd
                                                                count, and if it's greater than what the current sockets_fds_bitmap was
                                                                created to hold, frees sockets_fds_bitmap (and its contents), and
                                                                allocates a new one. if kmalloc fails, we return ENOMEM. otherwise, we
                                                                set the in, out, ex, res_in, res_out, and res_ex members of the
                                                                sockets_fds structure to offsets of our sockets_fds_bitmap structure,
                                                                and return 0. kcom_pkt_create creates a new kcom_pkt structure with the
                                                                len, type, and data members initialized to the passed in values. if
                                                                kzalloc fails, we return NULL. __kcom_node_find is called by the later
                                                                defined kcom_node_find to do the work of finding a node in our
                                                                kcom_nodes list that uses the passed sockaddr to communicate.  we use
                        doublecheck this return                 list_for_each_entry and memcmp to compare the address of our sock with
                        BUG: note the fixme regarding memcmp    the address of our node(!). this function will return NULL if it fails.
                                                                kcom_node_find wraps __kcom_node_find, grabbing the kcom_nodes_lock
                                                                before entry, and releasing it afterward. kcom_node_add is called by
                                                                accept_connection to create a new kcom_node struct, and adds it to the
                                                                kcom_nodes list. there is code commented out regarding finding out if
                                                                the node is already in the list, but it's incomplete. kcom_node_del
                                                                removes a node from the kcom_nodes list that uses the passed in
                                                                sockaddr. we acquire the kcom_nodes spinlock, then use __kcom_node_find
                                                                to find the node structure to be deleted. if we don't find one, we
                                                                release the kcom_nodes spinlock, and return -ENOENT. otherwise, we call
                                                                list_del to remove the node from our node list, release the spinlock,
                                                                close its fd, release its socket, free the node structure's memory, and
                        pull dead code.                         return 0. comm_simple is a stub that returns 0, and is not used
                                                                elsewhere in the code. we then declare comm_ack, comm_iovec, and
                                                                comm_iovec_ack, which also are not used anywhere else.
                                                                accept_connection is called by kcomd_thread (declared later), to accept
                                                                an incoming connection on a passed in socket. it starts by allocating
                                                                a new socket, and calling the accept() operation of the passed in
                                                                socket to accept a connection from the passed in socket, on our new
                                                                socket. there's a block of commented out code, for checking if a node
                                                                is already in our node_list, but it's unused/incomplete. we then use
                                                                sock_map_fd to get a file descriptor to this socket, add the node this
                                                                socket is communicating with to our node_list, and return our file
                                                                descriptor. if our socket allocation returns null, we return -1. if our
                                                                accept or sock_map_fd have problems, we release our socket, and return
                                                                -1. if our kcom_node_add fails, we close our fd, release our socket,
                                                                then return -1. data_read, data_write, and dispatch are all stubs that
                                                                return 0. data_read and data_write are called by kcomd_thread.
                                                                kcom_task_create creates a kcom_task structure allocated from
                                                                GFP_KERNEL for a given kcom_node and PID, initializing the pid, node,
                                                                and list members. if the kzalloc returns NULL, we return NULL.
                                                                kcom_task_delete deletes the first entry in the nodes list that matches
                                                                the given PID. these task list manipulation functions are missing the
                                                                spinlock code that the above node_list manipulation code has.
                                                                __kcom_task_find and kcom_task_find are formed like the above node find
                                                                code, but without its spinlock code. kcom_task_send uses
                                                                kcom_pkt_create to add a packet to the task structure belonging to the
                                                                pid passed in. it has comments regarding sleeping and replying, but
                                                                instead it returns 0. kcomd_thread is the function executed in kernel
                                                                space, as a kernel thread. first, we call daemonize to create a "kcomd"
                                                                process. we then wait for a connection on an ipv4 and an ipv6 socket.
                                                                when we receive a connection, we enter a large while loop (which we
                                                                never exit?). in this loop, we first call alloc_fd_bitmap to make sure
                                                                our fd bitmap is big enough to hold maxfds number of fds. we then zero
                                                                the in, out, and ex fd sets, add our two listening sockets to the in
                                                                set, add the listening fds of each node in our node_list to the in set,
                                                                add each fd in our node list that we have packets to send on to the out
                                                                set, zero the res_in, res_out, and res_ex set of fds, and call select.
                                                                if select returns -1, we return to the top of our loop. otherwise, we
                                                                test whether our v4 or v6 listening socket received a connection. if so,
                                                                we call accept_connection. we then test each fd belonging to our list
                                                                of nodes, and if they have data to read, call data_read (a NOP!), or
                                                                if they have data to be written, call data_write (also a NOP!).
                                                                at this point, we return to the top of our never-ending while loop.
                                                                kcom_init calls kernel_thread to start the aforementioned kcomd_thread
                                                                function. the rest of the file is just module glue for creating a kcomd
                                                                module, licensing it GPL, and attributing Vincent Hanquez as the
                                                                author.
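
                                                                as a rough illustration of the find / locked-find / delete pattern
                                                                described above for the kcom_nodes list, here is a userspace sketch.
                                                                it is not the kcomd code: a pthread mutex stands in for the spinlock,
                                                                a hand-rolled singly linked list stands in for list_head, and every
                                                                name (struct node, node_add, node_find, node_del, make_addr) is
                                                                hypothetical. it also compares address fields one by one instead of
                                                                using memcmp over the whole sockaddr, which is a choice made for this
                                                                sketch only.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical userspace stand-in for struct kcom_node */
struct node {
    struct node *next;
    int fd;
    struct sockaddr_in addr;
};

static struct node *nodes;  /* stand-in for the kcom_nodes list */
static pthread_mutex_t nodes_lock = PTHREAD_MUTEX_INITIALIZER;

/* helper for building test addresses (ip and port in host byte order) */
static struct sockaddr_in make_addr(uint32_t ip, uint16_t port)
{
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(ip);
    sa.sin_port = htons(port);
    return sa;
}

static int same_addr(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return a->sin_family == b->sin_family &&
           a->sin_port == b->sin_port &&
           a->sin_addr.s_addr == b->sin_addr.s_addr;
}

/* unlocked worker: caller must hold nodes_lock.
 * returns the link that points at the matching node, or NULL. */
static struct node **node_slot_locked(const struct sockaddr_in *addr)
{
    struct node **pp;
    for (pp = &nodes; *pp; pp = &(*pp)->next)
        if (same_addr(&(*pp)->addr, addr))
            return pp;
    return NULL;
}

/* locked wrapper around the worker above */
static struct node *node_find(const struct sockaddr_in *addr)
{
    pthread_mutex_lock(&nodes_lock);
    struct node **pp = node_slot_locked(addr);
    struct node *n = pp ? *pp : NULL;
    pthread_mutex_unlock(&nodes_lock);
    return n;
}

static int node_add(int fd, const struct sockaddr_in *addr)
{
    assert(addr != NULL);
    struct node *n = calloc(1, sizeof(*n));
    if (!n)
        return -ENOMEM;
    n->fd = fd;
    n->addr = *addr;
    pthread_mutex_lock(&nodes_lock);
    n->next = nodes;
    nodes = n;
    pthread_mutex_unlock(&nodes_lock);
    return 0;
}

/* unlink under the lock, free after dropping it; -ENOENT if absent */
static int node_del(const struct sockaddr_in *addr)
{
    pthread_mutex_lock(&nodes_lock);
    struct node **pp = node_slot_locked(addr);
    if (!pp) {
        pthread_mutex_unlock(&nodes_lock);
        return -ENOENT;
    }
    struct node *n = *pp;
    *pp = n->next;
    pthread_mutex_unlock(&nodes_lock);
    free(n);
    return 0;
}
```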
040     config          creates hpc/Kconfig                     this file defines our openmosix menu options in the kernel's
                                                                configuration system (menuconfig). we declare a top level menu titled
                                                                "HPC Options". our configuration options all exist under this entry.
                                                                first, we create an entry defining KCOMD as a tristate, or an item that
                                                                can be either on (in the kernel), off, or a module (loadable and
                                                                unloadable while the kernel is running). next we create an entry
                                                                defining OPENMOSIX as bool (in kernel, or not). this turns on or off
                                                                the parts of openmosix that have to be in-kernel for openmosix to
                                                                function. bool OPENMOSIX_VERBOSE is supposed to make openmosix more
                                                                verbose, but just serves to make OPENMOSIX_MIGRATION_VERBOSE and
                                                                OPENMOSIX_DEBUG_FS visible. bool OPENMOSIX_MIGRATION_VERBOSE enables
                                                                debugging messages of the form OM_VERBOSE_MIG(...) in
                                                                include/hpc/prototype.h. bool OPENMOSIX_DEBUG accomplishes many things.
                                                                first, it enables compilation and inclusion of hpc/debug.c, and an
                                                                architecture specific hpc/debug-$(ARCH).c, both of which contain
                                                                functions for printing the state of various structures, processor
                                                                registers, and other associated values. then, it enables debugging
                                                                messages of the form OMDEBUG(...) in include/hpc/debug.h. it enables
                                                                the tracking of the contents of the structure openmosix_options in
                                                                include/hpc/hpc.h, and makes OPENMOSIX_MIGRATION_DEBUG and
                        remove dead code.                       OPENMOSIX_DEBUG_FS visible. bool OPENMOSIX_MIGRATION_DEBUG
                                                                doesn't do anything, and can be safely removed. bool OPENMOSIX_DEBUG_FS
                                                                enables the compilation and inclusion of the contents of hpc/debugfs.c,
                                                                creating the om/ directory and its contents under the debugfs. bool
                                                                OPENMOSIX_CTRL_FS enables the compilation and inclusion of
                                                                hpc/ctrlfs.c, which is the control filesystem used to tell the kernel
                                                                to migrate processes, as well as to find out what node a process is
                                                                running on.
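
                                                                for orientation, the first few entries described above might look
                                                                roughly like the fragment below. this is a reconstruction from the
                                                                description, not a copy of hpc/Kconfig: the prompt strings are
                                                                placeholders, and expressing "makes visible" as a "depends on" line is
                                                                an assumption. the remaining options are left out.

```kconfig
menu "HPC Options"

# prompt strings below are placeholders, not the ones in the patch
config KCOMD
	tristate "kcomd"

config OPENMOSIX
	bool "openmosix"

config OPENMOSIX_VERBOSE
	bool "openmosix verbose"

# assumed: visibility is expressed with "depends on"
config OPENMOSIX_MIGRATION_VERBOSE
	bool "openmosix migration verbose"
	depends on OPENMOSIX_VERBOSE

endmenu
```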
041     openmosix       creates hpc/kernel.c                    this file is the kernel's interface to the openmosix system. it
                                                                contains only functions that are meant to be called by the kernel.
                                                                first, we export our openmosix_options datastructure, which contains
                                                                four constants that are used as "ceilings" for the OMDEBUG_* debugging
                                                                macros, settable through the debugfs. openmosix_pre_clone is called
                                                                when a process requests the clone syscall, before the kernel starts
                                                                processing it. in this function, we check whether the current process
                                                                has requested a shared memory space between the two clones, and if it
                                                                has, we mark the process as un-migratable for that reason, and increase
                                                                the usage count on its mm structure. note that as a result, both
                                                                processes will be marked DSTAY_CLONE, and both will have a usage count
                                                                +1 on the mm structure. processes are started with a usage count of 1.
                                                                openmosix_post_clone is called by the clone syscall, on the thread of
                                                                the parent, not the child, after the clone is completed. it checks the
                        magic!                                  mm_realusers counter. if it's just 1, then somehow the process magically
                        is this supposed to happen when a       decreased its usage count, and we clear the DSTAY_CLONE flag given to it
                        child dies, or otherwise drops the      by openmosix_pre_clone. task_maps_inode is supposed to check whether a
                        shared mm? stub!                        given task maps a given inode, but is just a stub.
                        monkey?                                 openmosix_no_longer_monkey is called from __remove_shared_vm_struct
                                                                to check every process on the machine and see whether it's using the
                                                                passed in inode. if it is, we set the DREQ_CHECKSTAY flag, as this
                                                                inode is about to be removed from service, and doing such may make this
                                                                process migratable. we acquire the tasklist_lock around our invocation
                                                                of for_each_process(). since the previous function is a stub, this
                                                                function does nothing. stay_me_and_my_clones is called by sys_mlock and
                                                                sys_mlockall in mm/mlock.c, as well as do_mmap_pgoff in mm/mmap.c. it
                                                                applies a given bitmask of reasons to the current task, and all tasks
                                                                that share its mm structure. first, it uses task_lock to lock the
                                                                current process, sets its stay reason, and task_unlocks. if the number
                                                                of mm_realusers is greater than one (some other process uses this
                                                                process's mm structure), we grab the tasklist_lock, use
                                                                for_each_process to search for processes with the same mm pointer, that
                                                                aren't the current process, and use task_lock/task_set_stay/task_unlock
                                                                to add our stay reasons to the found processes. obtain_mm is called by
                                                                mig_handle_migration() in hpc/migrecv.c and task_local_bring() in
                                                                hpc/migctrl.c to allocate a new mm structure, initialize it, and make
                                                                it the context of the current process. we start by checking to see if
                                                                there is currently an mm structure associated with the passed in task.
                                                                if there is, we call panic() to print a debugging message. we then
                                                                mm_alloc() a new mm, initialize it to hold our given task with
                                                                init_new_context(), acquire the mmlist_lock, initialize our new mm's
                                                                mmlist member with the mmlist of process zero, and release the
                                                                mmlist_lock. we then assign this mm to our process by first acquiring the
                                                                task_lock(), saving our current active mm, setting the task's active mm
                                                                and mm to our newly created mm, and task_unlock()ing. we call
                                                                activate_mm with our original and new mm, then mmdrop the old
                                                                active_mm. if our mm_alloc() fails, we return -ENOMEM. if
                                                                init_new_context() fails, we destroy our allocated mm, and return the
                                                                error init_new_context() failed with. otherwise, we return 0 for
                                                                success. unstay_mm is called by sys_munlock and sys_munlockall in
                                                                mm/mlock.c to request a re-evaluation of the stayability of the
                                                                current process, and all processes that share its mm structure. for the
                        premature optimization? looks good tho. common case of just one task using a given mm structure, we just call
                                                                task_set_dreqs(current, DREQ_CHECKSTAY). otherwise, we use
                                                                for_each_process() with a read_lock held on the tasklist_lock to
                                                                iterate through each process on the machine, checking if it's using our
                                                                passed mm, and if so, we call task_set_dreqs(p, DREQ_CHECKSTAY) on it.
                                                                remote_pre_usermode is called by the later defined
                                                                openmosix_pre_usermode to check for communication events before
                                                                entering userspace. it calls comm_peek() to see if there's pending
                                                                input, and if there is, it calls remote_do_comm() to process the
                                                                communication in question.  remote_pre_usermode always returns 0 for
                                                                success. deputy_pre_usermode is also called by openmosix_pre_usermode,
                                                                before jumping to userspace while handling a process in deputy state.
                                                                in this function, we just jump into deputy_main_loop, instead of going
                                                                to any real usermode code. when deputy_main_loop() returns, we return
                                                                0. openmosix_pre_usermode is called by assembly code in
                                                                arch/$ARCH/kernel/entry, when switching from kernel space to user
                                                                space. we first check for pending dreqs, and if we find one, we save
                                                                our current irq mask, call task_do_request, and restore our irq mask
                                                                once task_do_request returns. after dispatching dreqs, we call one of
                                                                the previous two functions depending on whether the process is in
                                                                DDEPUTY or DREMOTE state. like before, we save our irq mask before
                                                                calling remote_pre_usermode or deputy_pre_usermode, then restore it
                                                                once we return from userspace. this function always returns 0 for
                                                                success. openmosix_init is called on subsystem load. it starts the
                                                                mig_daemon kernel thread, and returns 0. the last line in this file
                                                                tells the subsystem system to call openmosix_init on initializing this
                                                                kernel component.
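
                                                                the stay_me_and_my_clones logic described above can be modelled in
                                                                userspace as below. this is only a sketch of the control flow: the
                                                                task table, struct layouts, flag values, DSTAY_EXAMPLE, and
                                                                setup_example are all made up for illustration, and the task_lock and
                                                                tasklist_lock steps are reduced to comments.

```c
#include <assert.h>
#include <stddef.h>

/* illustrative values only; the real ones live in hpc/task.h */
#define DSTAY_CLONE   0x1u
#define DSTAY_EXAMPLE 0x2u  /* hypothetical stay reason */

struct mm   { int realusers; };               /* models mm_realusers */
struct task { struct mm *mm; unsigned stay; };

#define NTASKS 3
static struct mm mms[2];
static struct task tasks[NTASKS];             /* models the process list */

/* tasks 0 and 1 share mms[0]; task 2 owns mms[1] alone */
static void setup_example(void)
{
    mms[0].realusers = 2;
    mms[1].realusers = 1;
    tasks[0] = (struct task){ &mms[0], 0 };
    tasks[1] = (struct task){ &mms[0], 0 };
    tasks[2] = (struct task){ &mms[1], 0 };
}

static void stay_me_and_my_clones(struct task *cur, unsigned reasons)
{
    assert(cur != NULL && cur->mm != NULL);

    /* kernel: task_lock(cur) ... task_unlock(cur) around this */
    cur->stay |= reasons;

    /* common case: nobody else uses this mm, so skip the scan */
    if (cur->mm->realusers <= 1)
        return;

    /* kernel: tasklist_lock held around for_each_process() */
    for (size_t i = 0; i < NTASKS; i++) {
        struct task *p = &tasks[i];
        if (p != cur && p->mm == cur->mm)
            p->stay |= reasons;  /* kernel: task_lock/task_set_stay/task_unlock */
    }
}
```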
042     config          creates hpc/Makefile                    this Makefile contains the make fragments that tell the kernel
                                                                what targets to build in the hpc/ directory. this code contains
                                                                five targets, obj-$(CONFIG_KCOMD) obj-$(CONFIG_OPENMOSIX)
                                                                obj-$(CONFIG_OPENMOSIX_CTRL_FS) obj-$(CONFIG_OPENMOSIX_DEBUG)
                                                                and obj-$(CONFIG_OPENMOSIX_DEBUG_FS). each of these targets matches
                                                                with the configuration variables defined by hpc/Kconfig.
                                                                obj-$(CONFIG_KCOMD) says to compile kcomd.o. obj-$(CONFIG_OPENMOSIX)
                                                                says to compile kernel.o, task.o, comm.o, remote.o, deputy.o,
                                                                copyuser.o, files.o, syscalls.o, migrecv.o, migsend.o, migctrl.o,
                                                                service.o, proc.o, and an arch-$(ARCH).o file containing architecture
                                                                specific functionality. proc.o has a comment noting that it's "legacy
                                                                code". obj-$(CONFIG_OPENMOSIX_CTRL_FS) says to create ctrlfs.o.
                                                                obj-$(CONFIG_OPENMOSIX_DEBUG) says to include debug.o, and
                                                                an architecture specific debug-$(ARCH).o. finally,
                                                                obj-$(CONFIG_OPENMOSIX_DEBUG_FS) says to include debugfs.o.
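
                                                                put together, the five targets described above would read something
                                                                like the fragment below. the object lists come from the description;
                                                                the use of "+=", the line breaks, and the ordering are assumptions,
                                                                and the "legacy code" comment on proc.o is omitted.

```make
obj-$(CONFIG_KCOMD)              += kcomd.o
obj-$(CONFIG_OPENMOSIX)          += kernel.o task.o comm.o remote.o deputy.o \
                                    copyuser.o files.o syscalls.o migrecv.o \
                                    migsend.o migctrl.o service.o proc.o \
                                    arch-$(ARCH).o
obj-$(CONFIG_OPENMOSIX_CTRL_FS)  += ctrlfs.o
obj-$(CONFIG_OPENMOSIX_DEBUG)    += debug.o debug-$(ARCH).o
obj-$(CONFIG_OPENMOSIX_DEBUG_FS) += debugfs.o
```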
043     ominterface     creates hpc/migctrl.c                   this file contains functions for moving processes via openmosix.
                                                                task_remote_expel is called either by task_remote_wait_expel or
                                                                remote_do_comm() in hpc/remote.c, to send a DREMOTE process back to
                                                                its original node, merging it with its deputy. first we check to make
                                                                sure the task we've been passed is in DREMOTE state, and use BUG_ON
                                                                if it's not. then, we use mig_send_hshake to request a migration back
                                                                from the home node. if this succeeds, we call mig_do_send to actually
                                                                perform the migration. after that, we destroy our link to the home node
                                                                by using task_set_comm to associate our link to null, then calling
                                                                comm_close() against our old link (returned by task_set_comm). we then
                        who gets this result?                   call do_exit(SIGKILL) to end the process. in case either of our
                                                                mig_send_hshake or mig_do_send calls fail, we OMBUG("failed\n"), and
                                                                return -1. task_remote_wait_expel is called by __task_move_to_node, to
                                                                return a task to the home node. it wraps the previous function, first
                                                                requesting permission to return home by sending a REM_BRING_HOME req,
                                                                then waiting on a DEP_COMING_HOME reply. if comm_recv fails, or we recv
                                                                something other than a DEP_COMING_HOME, we return -1. otherwise, we
                                                                call task_remote_expel. task_local_send is called by
                                                                __task_move_to_node to send a local task to a remote host. first, we
                                                                check to make sure the task is not in DDEPUTY state. if it is, we
                        returning success in case of 'error'?   return 0, as this process is already running on a remote node.
                                                                otherwise, we open a new connection using sockaddr_setup_port and
                                                                comm_setup_connect, then attach it to this process with
                                                                task_set_comm. we set the current process into DDEPUTY state, and
                        why use hshake here, and req above?     ask permission to send by sending a HSHAKE_MIG_REQUEST using
                                                                mig_send_hshake. if that succeeds, we call mig_do_send to actually send
                                                                the process to the remote node. when mig_do_send returns successfully,
                                                                the process has been sent to the remote node, and the local process is
                                                                now a deputy. we call deputy_startup, and return 0. if either
                                                                comm_setup_connect, mig_send_hshake, or mig_do_send returns failure, we
                                                                remove our DDEPUTY flag, destroy our link to the remote node (if
                                                                applicable), and return 0. task_local_bring is called by
                                                                __task_move_to_node to return a remote process to the current node,
                                                                re-merging it with its deputy. first, we check to make sure the current
                        returning success in case of error!     task is in DDEPUTY state. if it's not, we return 0. we then use
                                                                obtain_mm to get a new mm struct. we then make a DEP_COMING_HOME
                                                                request to the remote end, and use mig_recv_hshake to receive