Revision [17881]

This is an old revision of WikkaGopher made by BrianKoontz on 2007-12-27 02:08:39.

 





Rationale

The idea here is to facilitate the creation and maintenance of gopher content using existing Wikka features. The goal is not to write a new gopher server, but to manage content created by Wikka so that a gopher server can access and serve up Wikka-generated content. (This is similar to using Wikka as an HTMLHandler HTML generator, using Wikka markup to generate and serve HTML content.) There's no reason why Wikka couldn't serve up gopher content as well!

Proof of Concept

As it turns out, I did have a need to satisfy the following:

There's no reason why Wikka couldn't serve up gopher content as well!

I have a repository of files I wanted to serve up, but the machine is internal, and I really don't have a desire to (1) open access to the outside world, (2) serve them with all the overhead associated with a web server such as Apache, or (3) move the files to a machine that is accessible to the outside world. Gopher is a lightweight protocol that is ideal for serving up filesystems in situ, without having to deal with presentation issues or other needless overhead. I thought it would be interesting to have Wikka serve as a "gopher proxy," permitting access to gopherspace without regard to whether or not a user's browser supports the gopher protocol. As a proof of concept, I wrote some code that can access gopher sites, display (in a very rudimentary fashion) the site files and directories, and even download text and binary files.

Please note: This code is very unrefined, and is definitely not for use in a production environment! It is very likely things don't work (in fact, I deliberately failed to implement several gopher item types so that I could focus on just getting something to work), and I seriously doubt it's anywhere near being compliant with RFC 1436. However, it works with my gopher server, and fulfills the rather meager requirements I had.

That said, I offer up my initial hacks and welcome a brave soul who might be willing to step forward and see if they can create a gateway to gopherspace.

ToDo List

Where to start?


Security Implications

  1. Open proxies can, and will, be abused. An open proxy is an invitation for someone to use your machine as an anonymous scanner, which means your machine's IP address will show up in logs. While gopherspace is not that large at the moment, this code does not restrict access based upon host and/or port, so a malicious user could set up base on your Wikka server to perform anonymous scans. A user-modifiable list of allowable hosts and/or ports should be enforced.
  1. This proxy cannot positively identify remote connections as gopher servers. The gopher protocol does not provide a method for positively identifying a remote server as being a gopher server. Any server that responds to <CR><LF> will have its output parsed as if it were valid gopher data. Better checks need to be put into place to ensure that only data that is consistent with RFC 1436 is parsed and/or processed.
  1. Your local (internal) interfaces can be accessed. It's likely your internal machines sit behind the same firewall as your Wikka machine. The proxy should probably be smarter in determining whether or not local IP addresses (including localhost/127.0.0.1) can be accessed via the proxy.
  1. Denial of service (DoS) by specifying invalid ports. Right now, I'm watching my test server just spin away after asking the proxy to connect to an internal mail server. In its current state, the proxy is quite an effective DoS vector. A more consistent means of timing out the connection, coupled with restricting port access, should probably be implemented.
  1. DoS through excessive network connections/bandwidth usage. A malicious user could open many network connections to a gopher site, impairing both the proxy server and the gopher server. Also, the proxy currently reads content into a buffer to determine file size, MIME type, etc., so many requests could slow down the proxy server through memory depletion. Limits on the number of network connections allowed, as well as limits on bandwidth usage, need to be implemented. (I've been told that browsers do not need to be sent file size and MIME type information in the response headers to successfully download binary files. Also, the Gopher+ protocol does allow for the transmission of metadata, which would make it unnecessary for the proxy to buffer binary data from the gopher server.)
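
As a rough illustration of the last two points, here is a minimal sketch of a read loop that gives up after a byte cap or a wall-clock deadline. It is not part of the code on this page: the function name read_with_limits() and its parameters are hypothetical, and it uses PHP's generic stream functions (fread/feof) rather than the socket_* calls used by the proxy.

```php
<?php
// Hypothetical helper (not part of the proxy code on this page).
// Reads from an open stream until EOF. Returns the data read, or
// FALSE if the byte cap or the deadline is hit before EOF.
function read_with_limits($stream, $max_bytes, $max_seconds)
{
    $buffer = '';
    $start = time();
    while (!feof($stream)) {
        if ((time() - $start) >= $max_seconds) {
            return false;           // remote end is too slow
        }
        $chunk = fread($stream, 1024);
        if ($chunk === false) {
            return false;           // read error
        }
        $buffer .= $chunk;
        if (strlen($buffer) > $max_bytes) {
            return false;           // response is larger than allowed
        }
    }
    return $buffer;
}
```

For a network connection, one would presumably open the stream with fsockopen() (which accepts a connect timeout) and call stream_set_timeout() on it so that a single fread() cannot block indefinitely. The limits themselves would be a matter of local policy.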

System Requirements

You must have a version of PHP that is compiled with the --enable-sockets option. This simply will not work without this option enabled. I believe the socket extensions have been moved to PECL (bleh!) as of PHP 5.3.0, so I doubt this code will work without some modification. I'm running this on my test server with PHP 4.3.10, Apache 2.0, and the latest version of WikkaWiki 1.1.6.4 from the WikkaSVN SVN repository.
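
A quick way to find out whether the sockets extension is available is sketched below. The function name gopher_sockets_available() is made up for this example; extension_loaded() and function_exists() are standard PHP functions.

```php
<?php
// Hypothetical helper: TRUE when the sockets extension needed by the
// proxy (socket_create() and friends) is available in this PHP build.
function gopher_sockets_available()
{
    return extension_loaded('sockets') && function_exists('socket_create');
}
```

The action could call this first and print a short message instead of failing with an undefined function error.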

Getting Down and Dirty

OK, here it is! It's ugly, unrefined, uncommented, and the error handling doesn't work (because I'm still trying to decide if the client is responsible for displaying error messages, or the underlying classes). But, if you drop these files into your actions/ and handlers/page/ directories, it should work without much modification. At some point, I do plan on tidying things up. Feel free to edit this page (it's a wiki after all) with comments, code, and criticisms. I can handle it all.

Typical usage
{{gopher uri="quux.org"}}
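
The uri value may also carry a port, an item type, and a selector, in the form host[:port]/item_type/selector (this is the shape of the links the client generates). The following is only a sketch of how such a value breaks down, loosely mirroring the parsing in actions/gopher.php; split_gopher_uri() is a hypothetical name and not part of the files on this page.

```php
<?php
// Hypothetical helper that splits "host[:port]/item_type/selector".
function split_gopher_uri($uri)
{
    // Strip an optional scheme such as "gopher://"
    $uri = preg_replace('#^[a-z][a-z0-9+.-]*://#i', '', $uri);

    // At most three parts: host[:port], item type, selector
    $parts = explode('/', $uri, 3);
    $hostport = explode(':', $parts[0], 2);

    return array(
        'host'      => $hostport[0],
        // Port 70 is the default gopher port
        'port'      => isset($hostport[1]) ? (int)$hostport[1] : 70,
        // Item type 1 (a directory) is assumed when none is given
        'item_type' => (isset($parts[1]) && $parts[1] !== '') ? $parts[1] : '1',
        'selector'  => isset($parts[2]) ? $parts[2] : '',
    );
}
```

With this sketch, "quux.org" yields host quux.org, port 70, item type 1, and an empty selector.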


actions/gopher.php
<?php
   /*
    * This program is free software; you can redistribute it and/or
    * modify it under the terms of the GNU General Public License as
    * published by the Free Software Foundation; either version 2 of
    * the License, or (at your option) any later version.
    *
    * This program is distributed in the hope that it will be useful,
    * but WITHOUT ANY WARRANTY; without even the implied warranty of
    * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    * GNU General Public Licence for more details:
    *
    *            http://www.gnu.org/copyleft/gpl.html
    *
    *
    * @author        {@link http://wikkawiki.org/BrianKoontz Brian Koontz} <brian@pongonova.net>
    * @copyright    Copyright (c) 2007, Brian Koontz <brian@pongonova.net>
    */

    include_once('actions/gopherclient/gopherproxy.php');
    include_once('actions/gopherclient/gopherclient.php');
    if(isset($_GET['uri']))
    {
        $vars['uri'] = $_GET['uri'];
    }
    $uri = $this->cleanURL($vars['uri']);
    // Strip protocol
    if(preg_match('/^.*\/\/(.*)$/', $uri, $matches))
    {
        $uri = $matches[1];
    }

    // Separate host from selector
    $selector = '';
    $item_type = '';
    $host = '';
    if(strpos($uri, "/") > 0)
    {
        $fields = explode("/", $uri, 3);
        $host = $fields[0];
        $item_type = $fields[1];
        // The selector is absent for URIs such as "host/1"
        $selector = isset($fields[2]) ? $fields[2] : '';
    }
    else
    {
        $host = $uri;
    }
    if(!isset($item_type) || '' == $item_type)
    {
        $item_type = 1;
    }

    // Gopher it!
    // (The GopherProxy constructor expects the Wakka object first)
    $gp = new GopherProxy($this, $host);
    $prefix = $this->href()."/gopher?uri=";
    $gc = new GopherClient($prefix);
    $result = $gp->ProcessRequest($selector);
    $gc->ParseResponse($item_type, $result, $selector);
?>


actions/gopherclient/gopherclient.php
<?php
/********************************************************************
 * gopherclient.php - Client for parsing/displaying gopher responses
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation; either version 2 of
 * the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public Licence for more details:
 *
 *            http://www.gnu.org/copyleft/gpl.html
 *
 *
 * @author        {@link http://wikkawiki.org/BrianKoontz Brian Koontz} <brian@pongonova.net>
 * @copyright    Copyright (c) 2007, Brian Koontz <brian@pongonova.net>
 *
 ********************************************************************/


if(!defined('GOPHERCLIENT_GENERAL_ERROR')) define('GOPHERCLIENT_GENERAL_ERROR', 'General GopherClient error');

class GopherClient
{
    var $url_prefix;

    function GopherClient($url_prefix='')
    {
        $this->url_prefix = $url_prefix;
    }

    function ParseResponse($item_type, &$response, $selector='')
    {
        if(!isset($response))
        {
            return $this->ThrowError("Need a response to parse!");
        }

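        // Special handling for text and binary files
        // (This should probably be put into its own function)
        // Item types are compared as strings: against integers, a
        // non-numeric type such as "h" loosely equals 0 before PHP 8
        $item_type = (string)$item_type;
        $raw_data_item_types = array('0', '5', '9');
        $text_item_types = array('0');
        $binary_item_types = array('5', '9');
        if(TRUE===in_array($item_type, $raw_data_item_types))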
        {
            if(TRUE===in_array($item_type, $binary_item_types))
            {
                // We need the selector for this...
                if(!isset($selector) || '' == $selector)
                {
                    return $this->ThrowError("Need the selector!");
                }
                // The following was adapted from code posted by
                // Hillar Aarelaid at
                // http://wikkawiki.org/FilesActionHillar
                $filename = basename($selector);
                $suffix = '';
                if (preg_match("/.*\.(\w+)$/i",$filename,$res))
                    $suffix=$res[1];
                // Search MIME Type. GopherClient defines no $config of
                // its own, so unless 'mime_types' is supplied this
                // falls back to application/octet-stream.
                if (!$suffix || $suffix=="" || empty($this->config['mime_types'])
                    || !$mimes=implode("\n",file($this->config['mime_types'])))
                    $content_type="application/octet-stream";
                else
                {
                    if (preg_match("/([A-Za-z.\/-]*).*$suffix/i",$mimes,$result))
                        $content_type=$result[1];
                    else
                        $content_type="application/octet-stream";
                }
                header("Pragma: public");
                header("Expires: 0");
                header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
                header("Cache-Control: public");
                header("Content-Description: File Transfer");

                header("Content-Type: ".$content_type);

                // Force the download
                header("Content-Disposition: attachment; filename=".$filename);
                header("Content-Transfer-Encoding: binary");
                header("Content-Length: ".strlen($response));
                // header("Connection: close");
                echo $response;
                exit;
            }
            else if(TRUE===in_array($item_type, $text_item_types))
            {
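                // Escape the remote text before adding line breaks, so
                // markup sent by the gopher server is not rendered
                $response = preg_replace('/\n/', "<br/>\n", htmlspecialchars($response));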
                echo $response;
                return;
            }
            else
            {
                // Better check your item_type arrays...punting on
                // this one, let's hope it can be handled...
            }
        }

        // Explode on \r\n
        $lines = array_filter(explode("\r\n", $response));
        $last = array_pop($lines);
        // Some servers aren't returning a "." as the last line, so we
        // just can't throw that line away
        if(0 == preg_match('/^\.$/', $last))
        {
            array_push($lines, $last);
        }

        foreach($lines as $line)
        {
            $item_type = substr($line, 0, 1);
            $fields = explode("\t", substr($line, 1));
            echo $this->FormatResponseLine($item_type, $fields);
        }
    }

    /********************************************************************
     * fields[0] - Display string
     * fields[1] - Selector
     * fields[2] - Host
     * fields[3] - Port
     ********************************************************************/

    function FormatResponseLine($item_type, $fields)
    {
        $prefix = '';
        switch((string)$item_type)
        {
        case "0":
            $prefix = "[FILE] ";
            break;
        case "1":
            $prefix = "[DIR]  ";
            break;
        case "5":
            $prefix = "[BIN]  ";
            break;
        case "9":
            $prefix = "[BIN]  ";
            break;
        case "i":
            $prefix = '';
            break;
        default:
            $prefix = "[UNK]  ";
            break;
        }
        $url = $this->url_prefix;
        $url .= rtrim($fields[2], "/");
        if(isset($fields[3]))
        {  
            $url .= ":$fields[3]";
        }      
        $url .= "/";
        $selector = ltrim($fields[1], "/");
        if(!empty($selector))
        {  
            $url .= $item_type."/";
            $url .= $selector;  
        }      
           
        if(empty($prefix))
        {
            return $fields[0]."<br/>\n";
        }
        else
        {
            return $prefix."<a href=\"".$url."\">".$fields[0]."</a><br/>\n";
        }
    }  
           
    function ThrowError($err='')
    {
        return GOPHERCLIENT_GENERAL_ERROR.": ".$err."\n";
    }  
}          
?>          
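
The field layout documented above FormatResponseLine() (display string, selector, host, port) also suggests a cheap sanity check before a line is rendered, which would go some way toward rejecting output from servers that aren't speaking gopher at all. What follows is only a sketch under my own assumptions: parse_menu_line() is a hypothetical name, and the check is stricter than the client above (informational "i" lines from lax servers that omit host or port would be rejected).

```php
<?php
// Hypothetical validator: returns the fields of a well-formed menu
// line as an array, or FALSE when the line does not look like gopher
// menu data (item type + four tab-separated fields).
function parse_menu_line($line)
{
    if (strlen($line) < 2) {
        return false;
    }
    // The first character is the item type; the rest is tab-separated
    $fields = explode("\t", substr($line, 1));
    // Gopher+ servers may append extra fields, so require at least four
    if (count($fields) < 4) {
        return false;
    }
    if ($fields[2] === '') {
        return false;               // no host
    }
    if (!preg_match('/^[0-9]{1,5}$/', $fields[3]) || (int)$fields[3] > 65535) {
        return false;               // port is not a valid number
    }
    return array(
        'type'     => $line[0],
        'display'  => $fields[0],
        'selector' => $fields[1],
        'host'     => $fields[2],
        'port'     => (int)$fields[3],
    );
}
```

A banner from a mail or web server contains no tabs, so it fails the field count check and could be reported as an error instead of being displayed as a menu.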


actions/gopherclient/gopherproxy.php
<?php
/********************************************************************
 * gopherproxy.php - Proxy for accessing gopher sites
 *
 ********************************************************************/


if(!defined('GOPHERPROXY_SOCKET_ERROR')) define('GOPHERPROXY_SOCKET_ERROR', 'Socket error');
if(!defined('GOPHERPROXY_READ_LENGTH')) define('GOPHERPROXY_READ_LENGTH', 1024);
if(!defined('GOPHERPROXY_SELECT_TIMEOUT')) define('GOPHERPROXY_SELECT_TIMEOUT', 5);

class GopherProxy
{
    var $wakka;
    var $host;
    var $port;   // Default: 70
    var $timeout; // Default: 30 seconds
    var $socket;
    var $config;

    function GopherProxy($wakka, $host, $port=70, $timeout=30)
    {
        $this->wakka = $wakka;
        include_once('actions/gopherclient/gopherproxy.config.php');
        $this->config = $gopherProxyConfig;
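        // An empty 'restrict_pages_to' means the action may be called
        // from any page (see gopherproxy.config.php)
        $restrict = trim($this->config['restrict_pages_to']);
        if('' != $restrict)
        {
            $restrict_pages = array_map("trim", explode(",", $restrict));
            if(FALSE===in_array($this->wakka->GetPageTag(), $restrict_pages))
            {
                print $this->ThrowError("Not permitted to call this action from this page");
                exit;
            }
        }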
        if(strpos($host, ":") > 0)
        {
            $this->host = substr($host, 0, strpos($host, ":"));
            $this->port = substr($host, strpos($host, ":")+1);
        }
        else
        {
            $this->host = $host;
            $this->port = $port;
        }
        $this->timeout = $timeout;
    }

    function _setup_socket($request='')
    {
        if(FALSE===$this->ValidateHost($this->host, $this->port))
        {
            print $this->ThrowError("Access denied");
            exit;
        }
        $this->socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
        if(FALSE===$this->socket)
        {
            return $this->ThrowError();
        }
        if(FALSE===socket_set_nonblock($this->socket))
        {
            return $this->ThrowError("socket_set_nonblock() failed");
        }

        //socket_bind($this->socket, '127.0.0.1', 0);

        if(FALSE===socket_set_option($this->socket, SOL_SOCKET, SO_REUSEADDR, 1))
        {
            return $this->ThrowError();
        }

        $time = time();
        while(FALSE===@socket_connect($this->socket, $this->host, $this->port))
        {
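            $err = socket_last_error($this->socket);
            // 115 = EINPROGRESS, 114 = EALREADY (Linux errno values):
            // the non-blocking connect is still in progress
            if($err == 115 || $err == 114)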
            {
                if((time() - $time) >= $this->timeout)
                {
                    return $this->ThrowError("Connection timed out");
                }
                sleep(1);
                continue;
            }
            return $this->ThrowError();
        }
    }

    function _shutdown_socket()
    {
        if(isset($this->socket))
        {
            socket_close($this->socket);
        }
    }
    function ProcessRequest($request='')
    {
        // _setup_socket() returns an error string on failure and
        // nothing on success
        $err = $this->_setup_socket();
        if(NULL !== $err)
        {
            return $err;
        }
       
        if(FALSE===socket_set_block($this->socket))
        {
            return $this->ThrowError();
        }

        if(FALSE===socket_write($this->socket, "$request\r\n"))
        {
            return $this->ThrowError();
        }
       
        $read = array($this->socket);
        $buffer = '';
        while(true)
        {
            $select = socket_select($read, $write=NULL, $except=NULL, GOPHERPROXY_SELECT_TIMEOUT);
            if(FALSE !== $select && $select > 0)
            {
                $readbuf = socket_read($this->socket, GOPHERPROXY_READ_LENGTH);
                if(''==$readbuf)
                {
                    break;
                }
                while(0 != strlen($readbuf))
                {
                    $buffer .= $readbuf;
                    $readbuf = socket_read($this->socket, GOPHERPROXY_READ_LENGTH);    
                }
            }
            else
            {
                // select failed or timed out
                return $this->ThrowError();
            }
        }
        $this->_shutdown_socket();
        return $buffer;
    }
    function ThrowError($err='')
    {
        $this->_shutdown_socket();
        $this->socket = NULL;
        if(empty($err))
        {
            $err = socket_strerror(socket_last_error());
            return GOPHERPROXY_SOCKET_ERROR.": ".$err."\n";
        }
        return $err;
    }

    // Returns TRUE if access is permitted, FALSE otherwise
    function ValidateHost($host, $port=70)
    {
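        // Check for allowed hosts first
        $vals = explode(",", $this->config['hosts_allow']);
        // explode() always returns at least one element, so test the
        // first entry: an empty pattern would match every host
        if(!empty($vals[0]))
        {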
            foreach($vals as $str)
            {
                $str = trim($str);
                $spec = explode(":", $str);
                if(preg_match("/$spec[0]/", $host) > 0)
                {
                    // Ports are compared exactly ("70" must not match 7070)
                    if(!isset($spec[1]) ||
                        $spec[1] == $port)
                    {
                        return TRUE;
                    }
                }
            }
        }

        // Check for denied hosts
        $vals = explode(",", $this->config['hosts_deny']);
        if(!empty($vals[0]))
        {
            foreach($vals as $str)
            {
                $str = trim($str);
                $spec = explode(":", $str);
                if(preg_match("/$spec[0]/", $host) > 0)
                {
                    // Ports are compared exactly ("70" must not match 7070)
                    if(!isset($spec[1]) ||
                        $spec[1] == $port)
                    {
                        return FALSE;
                    }
                }
            }
        }

        // No matches always default to allowed access
        return TRUE;
    }
}
?>


actions/gopherclient/gopherproxy.config.php
<?php  
   
// hosts_allow, hosts_deny:
//  
// Modeled after UNIX hosts_access. The proxy consults two settings,
// 'hosts_allow' and 'hosts_deny'. The host search stops at the first
// match:
//      - Access will be granted when a host matches on an entry in
//      'hosts_allow'.
//      - Otherwise, access will be denied when a host matches an
//      entry in 'hosts_deny'.
//      - Otherwise, access will be granted.
// A non-existing config entry is treated as if it were empty. Thus,
// access control can be turned off by providing no entries (or
// setting the entries to '').
//              
// Valid regular expressions may be used.  It is up to you to ensure
// your expressions work. Patterns are not anchored, so '172\.'
// matches anywhere in the host string. If in doubt, simply list the
// IPs one at a time. To allow and/or deny all hostnames, use '.*'.
// Subnets (e.g., 192.168.0/24) are not supported at this time.
//
// You may optionally append a port number (e.g., 192.168.0.3:70).
// The port number is stripped prior to performing the match;
// wildcards are not allowed here. If you don't want to restrict any
// ports, don't specify a port number.
//      
// restrict_pages_to:
//      
// Allow this action to be called only from the specified wiki pages.
// An empty value allows the action to be called from any page
// (meaning a user can create a page and call this action, so you
// probably want to set hosts_allow and hosts_deny as well).  
// Restricting write access to these pages will keep a malicious
// person from modifying the action params. Comma-separate multiple pages.
               
$gopherProxyConfig = array(
    'hosts_allow'=>'192\.168\.0\.3:70',
    'hosts_deny'=>'localhost, 127\.0\.0\.1, 192\.168\., 172\.',
    'restrict_pages_to' => 'GopherTest');
?>                  
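
The config comments note that subnets are not supported. As a sketch of how CIDR-style entries might be handled, the following applies the same order of evaluation described above (allow, then deny, then grant) to IPv4 subnets. Both function names are hypothetical, the code assumes IPv4 addresses and a 64-bit PHP build, and it is not wired into ValidateHost().

```php
<?php
// Hypothetical helper: TRUE when dotted-quad $ip lies inside $cidr,
// e.g. ipv4_in_cidr('192.168.0.77', '192.168.0.0/24').
// Assumes IPv4 and a 64-bit PHP build.
function ipv4_in_cidr($ip, $cidr)
{
    $parts = explode('/', $cidr, 2);
    // A bare address is treated as a /32
    $bits = isset($parts[1]) ? (int)$parts[1] : 32;
    $ip_long = ip2long($ip);
    $net_long = ip2long($parts[0]);
    if ($ip_long === false || $net_long === false || $bits < 0 || $bits > 32) {
        return false;
    }
    // Build a mask with the top $bits bits set
    $mask = ($bits === 0) ? 0 : ((~0 << (32 - $bits)) & 0xFFFFFFFF);
    return ($ip_long & $mask) === ($net_long & $mask);
}

// Same order as the config comments: a match in the allow list wins,
// then a match in the deny list, otherwise access is granted.
function cidr_access_permitted($ip, $allow, $deny)
{
    foreach ($allow as $cidr) {
        if (ipv4_in_cidr($ip, $cidr)) {
            return true;
        }
    }
    foreach ($deny as $cidr) {
        if (ipv4_in_cidr($ip, $cidr)) {
            return false;
        }
    }
    return true;
}
```

To be of any use against hostnames that point at internal machines, the name would have to be resolved first (for instance with gethostbyname()) and the resulting address checked, rather than the name the user typed.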


handlers/page/gopher.php
<div class="page">
<?php
    include_once("actions/gopher.php");
?>
</div>