diff --git a/.gitignore b/.gitignore
index 05b7be35..d6c99b1f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,5 @@
 *.swp
 .ipynb_checkpoints/
 .env/
+my_data.*
+cv/
diff --git a/README.md b/README.md
index cb09269b..a4883140 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,6 @@
 # torch-rnn
+This repository is a fork of [torch-rnn](https://github.com/jcjohnson/torch-rnn) with modifications that improve compatibility with more recent versions of some dependencies, plus an added example of a network trained to simulate the contents of RFC standards.
+
 torch-rnn provides high-performance, reusable RNN and LSTM modules for torch7,
 and uses these modules for character-level
 language modeling similar to [char-rnn](https://github.com/karpathy/char-rnn).
@@ -11,10 +13,6 @@ the [Benchmark](#benchmarks) section below.
 
 # Installation
 
-## Docker Images
-Cristian Baldi has prepared Docker images for both CPU-only mode and GPU mode;
-you can [find them here](https://github.com/crisbal/docker-torch-rnn).
-
 ## System setup
 You'll need to install the header files for Python 2.7 and the HDF5 library. On
 Ubuntu you should be able to install like this:
@@ -56,9 +54,22 @@ luarocks install optim
 luarocks install lua-cjson
 
 # We need to install torch-hdf5 from GitHub
-git clone https://github.com/deepmind/torch-hdf5
+git clone https://github.com/anibali/torch-hdf5.git
+# Note that the original deepmind version of torch-hdf5 is currently
+# incompatible with the HDF5 1.10 library versions used by several recent
+# OS releases, including Ubuntu 17.04 and newer versions of macOS
 cd torch-hdf5
+git checkout hdf5-1.10
 luarocks make hdf5-0-0.rockspec
+
+# Lua may fail to find the hdf5.h header. The workaround below may not be
+# the cleanest solution, but it works:
+
+# Find the hdf5.h file
+find / -name hdf5.h 2>/dev/null
+# then edit the file ~/torch/install/share/lua/5.1/hdf5/config.lua
+# and set HDF5_INCLUDE_PATH to the directory that contains hdf5.h
+
 ```
 
 ### CUDA support (Optional)
@@ -146,6 +157,14 @@ to run in OpenCL mode add the flag `-gpu_backend opencl`. There are
 more flags you can use to configure sampling;
 [read about them here](doc/flags.md#sampling).
+# Example network - RFC simulation
+
+This repository contains an example of a trained LSTM network that simulates the contents of the [RFC standards](https://en.wikipedia.org/wiki/Request_for_Comments) used to define Internet protocols. The network was trained on the text of all RFCs published to date. You can generate your own RFC samples like this:
+
+```bash
+th sample.lua -checkpoint example-rfcs/rfcs.t7 -length 2000
+```
+
 # Benchmarks
 To benchmark `torch-rnn` against `char-rnn`, we use each to train LSTM language
 models for the tiny-shakespeare dataset with 1, 2 or 3 layers and with an RNN
 size of 64, 128, 256, or 512. For each we use a minibatch size of 50, a sequence
diff --git a/example-rfcs/rfcs.t7 b/example-rfcs/rfcs.t7
new file mode 100644
index 00000000..036793e5
Binary files /dev/null and b/example-rfcs/rfcs.t7 differ
diff --git a/requirements.txt b/requirements.txt
index e0ba7699..d92f7277 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,5 @@
 Cython==0.23.4
 numpy==1.10.4
 argparse==1.2.1
-h5py==2.5.0
+h5py==2.7.1
 six==1.10.0
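The manual config.lua edit described in the install notes can also be scripted. The sketch below is illustrative only: the `/usr/include/hdf5/serial` location is an assumed Ubuntu default (substitute whatever `find / -name hdf5.h` actually reports), the exact `HDF5_INCLUDE_PATH = "...",` line format inside config.lua is an assumption, and a stand-in temp file is used in place of the real config so the snippet is self-contained.

```shell
# Sketch of the config.lua workaround; CONFIG would normally be
# ~/torch/install/share/lua/5.1/hdf5/config.lua, but a stand-in file
# is used here so the snippet can run anywhere.
CONFIG=$(mktemp)
printf 'return {\n    HDF5_INCLUDE_PATH = "/usr/include",\n}\n' > "$CONFIG"

# Assumed Ubuntu location of hdf5.h; use whatever
# `find / -name hdf5.h 2>/dev/null` reports on your machine.
HDF5_INC="/usr/include/hdf5/serial"

# Point the HDF5_INCLUDE_PATH entry at that directory (backup kept as *.bak)
sed -i.bak "s|HDF5_INCLUDE_PATH = .*|HDF5_INCLUDE_PATH = \"$HDF5_INC\",|" "$CONFIG"
grep HDF5_INCLUDE_PATH "$CONFIG"
```

Running the real command against the actual config.lua is the same `sed` invocation with `CONFIG` pointed at the installed file.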