{"id":159,"date":"2008-02-19T22:18:06","date_gmt":"2008-02-19T21:18:06","guid":{"rendered":"http:\/\/www.nobugs.org\/blog\/archives\/2008\/02\/19\/sleight-of-haskelly-hand-and-the-appearance-of-a-process\/"},"modified":"2009-01-11T01:03:19","modified_gmt":"2009-01-11T00:03:19","slug":"sleight-of-haskelly-hand-and-the-appearance-of-a-process","status":"publish","type":"post","link":"https:\/\/www.nobugs.org\/blog\/archives\/2008\/02\/19\/sleight-of-haskelly-hand-and-the-appearance-of-a-process\/","title":{"rendered":"Sleight of Haskelly Hand (and The Appearance Of A Process)"},"content":{"rendered":"<p>Here&#8217;s some low-level hackery fun which revealed something I didn&#8217;t know about unix until yesterday.  <a href=\"http:\/\/haskell.org\/haskellwiki\/Yi\">Yi<\/a> (the emacs clone in haskell) currently implements &#8220;code updating&#8221; by persisting the application state, calling exec() to replace the program code with the latest version, and then restoring the previous state.  Lots of applications do this to some extent. However, yi needs to be a bit smart because (like emacs) it can have open network connections and open file handles which also need to survive the restart but aren&#8217;t trivially persistable.  For example, yi could be running subshells or <a href=\"http:\/\/delysid.org\/emacs\/erc.html\">irc clients<\/a>.<\/p>\n<p>Fortunately, this is possible!  When you call exec(), existing file descriptors remain open (unless they have been marked close-on-exec).  This is very different from starting a new process from scratch. So all we need to do is persist some information about which descriptors were doing which particular job.  Then, when we start up again, we can rewire all our file handles and network connections and carry on as if nothing has happened.<\/p>\n<p>Here&#8217;s an example haskell app which shows this in action.  
First of all we need to import various bits:<\/p>\n<pre lang=\"haskell\"> \r\nimport System.Posix.Types\r\nimport System.Posix.Process\r\nimport System.Posix.IO\r\nimport System.IO\r\nimport Network.Socket\r\nimport System( getArgs, getProgName )\r\nimport Foreign.C.Types\r\n<\/pre>\n<p>Next we have a &#8220;main&#8221; function which distinguishes between &#8220;the first run&#8221; and &#8220;the second run&#8221; (i.e. after re-exec&#8217;ing) by the presence of command line arguments:<\/p>\n<pre lang=\"haskell\"> \r\nmain :: IO ()\r\nmain = do\r\n  args <- getArgs\r\n  case args of\r\n    [] -> firsttime\r\n    [ file_fd, net_fd ] -> reuse (read file_fd) (read net_fd)\r\n<\/pre>\n<p>The first time we run, we open a network connection to http:\/\/example.com and we also open a disk file for writing.  We then re-exec the current process to start over again, but also pass the disk file fd as the first command line argument, and the network socket fd as the second argument.  Both are just integers:<\/p>\n<pre lang=\"haskell\"> \r\nfirsttime :: IO ()\r\nfirsttime = do \r\n  -- Open a file, grab its fd\r\n  Fd file_fd <- handleToFd =<< openFile \"\/tmp\/some-file\" WriteMode\r\n\r\n  -- Open a socket, grab its fd\r\n  socket <- socket AF_INET Stream defaultProtocol \r\n  addr <- inet_addr \"208.77.188.166\" -- example.com\r\n  connect socket (SockAddrInet 80 addr)\r\n  send socket \"GET \/ HTTP\/1.0\\n\\n\"\r\n  let net_fd = fdSocket socket\r\n\r\n  -- re-exec ourselves, passing both fds as arguments\r\n  pn <- getProgName\r\n  putStrLn $ \"Now re-execing as \" ++ pn ++ \" \" ++ show file_fd ++ \" \" ++ show net_fd\r\n  executeFile (\".\/\" ++ pn) False [ show file_fd, show net_fd ] Nothing\r\n<\/pre>\n<p>The second time we run, we pick up these two file descriptors and proceed to use them.  
In this code, we read an HTTP response from the network connection and write it to the disk file.<\/p>\n<pre lang=\"haskell\"> \r\nreuse :: CInt -> CInt -> IO ()\r\nreuse file_fd net_fd = do\r\n  putStrLn \"Hello again, I've been re-exec'd!\"\r\n\r\n  -- Rebuild a socket around the inherited network fd\r\n  putStrLn $ \"Using fd \" ++ show net_fd ++ \" as a network connection\"\r\n  socket <- mkSocket net_fd AF_INET Stream defaultProtocol Connected\r\n  msg <- recv socket 100\r\n\r\n  -- Rebuild a handle around the inherited file fd\r\n  putStrLn $ \"Using fd \" ++ show file_fd ++ \" as an output file\"\r\n  h <- fdToHandle (Fd file_fd)\r\n  hPutStrLn h $ \"Got this from network: \" ++ msg\r\n\r\n  hClose h\r\n  sClose socket\r\n\r\n  putStrLn \"Now look in \/tmp\/some-file\"\r\n<\/pre>\n<p>... and we end up with the file containing text retrieved from a network connection which was made in a previous life.  It is a curious and useful technique, and I find it interesting because it made me realise that I usually think of a \"unix process\" as being the same thing as \"an instance of grep\" or \"an instance of emacs\".  But a process can change its skin many times during its lifetime.  It can \"become\" many different creatures by exec()ing many times, and it can keep the same file descriptors throughout.  I've only ever seen exec() paired with a fork() call before, but that's just one way to use it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here&#8217;s some low-level hackery fun which revealed something I didn&#8217;t know about unix until yesterday. Yi (the emacs clone in haskell) currently implements &#8220;code updating&#8221; by persisting the application state, and calling exec() to replace the program code with the latest version, and then restores the previous state. 
Lots of applications do this to some [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[9,13],"class_list":["post-159","post","type-post","status-publish","format-standard","hentry","category-programming","tag-haskell","tag-unix"],"_links":{"self":[{"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/posts\/159","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/comments?post=159"}],"version-history":[{"count":1,"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/posts\/159\/revisions"}],"predecessor-version":[{"id":251,"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/posts\/159\/revisions\/251"}],"wp:attachment":[{"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/media?parent=159"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/categories?post=159"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.nobugs.org\/blog\/wp-json\/wp\/v2\/tags?post=159"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}