commit ad09f286e8f3f8972fe652cc183e4f1d8082eac5 Author: Donggang Cao Date: Mon Mar 14 18:43:44 2016 +0800 add userguide to docklet-book diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..93bfd12 --- /dev/null +++ b/.gitignore @@ -0,0 +1,2 @@ +*~ +.DS_Store diff --git a/README.md b/README.md new file mode 100644 index 0000000..953cb0b --- /dev/null +++ b/README.md @@ -0,0 +1,4 @@ +Docklet Document +================ + + diff --git a/userguide/.gitignore b/userguide/.gitignore new file mode 100644 index 0000000..0d73090 --- /dev/null +++ b/userguide/.gitignore @@ -0,0 +1,6 @@ +_book +*~ +.DS_Store +node_modules +.epub +book*pdf diff --git a/userguide/LANGS.md b/userguide/LANGS.md new file mode 100644 index 0000000..bf64d27 --- /dev/null +++ b/userguide/LANGS.md @@ -0,0 +1,2 @@ +* [中文](zh) +* [English](en) diff --git a/userguide/Makefile b/userguide/Makefile new file mode 100644 index 0000000..3613776 --- /dev/null +++ b/userguide/Makefile @@ -0,0 +1,30 @@ +# generating html site using gitbook +# uses language-picker plugin, so need to install it +# +# npm install gitbook-plugin-language-picker +# + +html: check + gitbook build + +pdf: check + gitbook pdf + +epub: check + gitbook epub + +check: + @if [ -n "`which gitbook`" ] ; then \ + echo "gitbook found" ; \ + else \ + echo "gitbook not found, try: npm install gitbook-cli -g" ; \ + exit 1 ; \ + fi + +serve: + gitbook serve + +all: html pdf epub + +clean: + rm -rf book*.pdf *.epub _book node_modules diff --git a/userguide/README.md b/userguide/README.md new file mode 100644 index 0000000..40a4770 --- /dev/null +++ b/userguide/README.md @@ -0,0 +1,5 @@ +Docklet User Guide +================== + + + diff --git a/userguide/book.json b/userguide/book.json new file mode 100644 index 0000000..954516d --- /dev/null +++ b/userguide/book.json @@ -0,0 +1,5 @@ +{ + "plugins": [ + "language-picker" + ] +} diff --git a/userguide/cover.jpg b/userguide/cover.jpg new file mode 100644 index 0000000..dc29b0a Binary 
files /dev/null and b/userguide/cover.jpg differ diff --git a/userguide/cover_small.jpg b/userguide/cover_small.jpg new file mode 100644 index 0000000..578abbe Binary files /dev/null and b/userguide/cover_small.jpg differ diff --git a/userguide/en/README.md b/userguide/en/README.md new file mode 100644 index 0000000..5fb61df --- /dev/null +++ b/userguide/en/README.md @@ -0,0 +1,10 @@ +Docklet User Guide +=============== + +v0.2 + +Feb 20, 2016 + +© Software Engineering Institute, Peking University + +http://www.unias.org/trac/docklet diff --git a/userguide/en/SUMMARY.md b/userguide/en/SUMMARY.md new file mode 100644 index 0000000..8ea8e18 --- /dev/null +++ b/userguide/en/SUMMARY.md @@ -0,0 +1,37 @@ +# Summary + +* [Docklet Introduction](intro/README.md) + * [What is Docklet](intro/what.md) + * [Why to use Docklet](intro/why.md) + * [How to use Docklet](intro/how.md) +* [Basic Concepts](concepts/README.md) + * [Workspace](concepts/workspace.md) + * [vCluster](concepts/vcluster.md) + * [Image](concepts/image.md) + * [Service](concepts/service.md) + * [Dataspace](concepts/nfs.md) +* [Portal](portal/README.md) + * [Dashboard](portal/dashboard.md) + * [Config](portal/config.md) + * [Status](portal/status.md) + * [Hosts](portal/hosts.md) + * [Users](portal/users.md) + * [Admin](portal/admin.md) +* [Workspace](workspace/README.md) + * [Data Management](workspace/data.md) + * [File Editing](workspace/edit.md) + * [WEB Terminal](workspace/terminal.md) + * [Interactive Programming](workspace/notebook.md) +* [Development in WEB Terminal](software/README.md) + * [Available Tools](software/list.md) + * [C/C++](software/c.md) + * [Java](software/java.md) + * [MPI](software/mpi.md) + * [Spark](software/spark.md) + * [LaTeX](software/latex.md) +* [Interactive Computing and Analyzing](notebook/README.md) + * [Python Packages](notebook/python.md) + * [R Packages](notebook/r.md) + * [Notebook Usage](notebook/nb.md) + * [Scientific Computation](notebook/gallery.md) + + diff --git
a/userguide/en/concepts/README.md b/userguide/en/concepts/README.md new file mode 100644 index 0000000..71bc4c5 --- /dev/null +++ b/userguide/en/concepts/README.md @@ -0,0 +1,9 @@ +# Basic Concepts # + +Docklet has the following basic concepts: + +* [Workspace](workspace.md) +* [Vcluster](vcluster.md) +* [Image](image.md) +* [Service](service.md) +* [Dataspace](nfs.md) diff --git a/userguide/en/concepts/image.md b/userguide/en/concepts/image.md new file mode 100644 index 0000000..954d34c --- /dev/null +++ b/userguide/en/concepts/image.md @@ -0,0 +1,24 @@ +## Image ## + +The concept of image in Docklet is similar to that in [Docker](https://github.com/docker/docker), i.e., a read-only template for creating containers. But there are some differences. + +Since Docker mainly focuses on LXC containers at the IaaS level, its images are +very flexible and versatile. Users can customize and make their own images +and then upload them to Docker Hub. Usually one Docker image has +only the environment of one specific application. For example, a Docker +image may have a full Ubuntu operating environment, in which there is +only the Apache web server software. + +Docklet is a bit different. Docklet focuses on **Workspace** and +**vcluster**, and therefore places special restrictions on images: it does not +support uploading user-made images. The Docklet administrator will +make one or more public base images according to the common requirements +of the enterprise, containing the software packages needed to support +Workspace, together with some programming tools and frameworks such as +MPI, Spark, Python and R. Since the Docklet public base images are +shared by all users, they are usually larger than typical Docker +images. + +If users cannot find the software tools they need in the public base image, +they can install them in their Workspace. They can then save their +Workspace as a private image, and can choose to share it with others.
diff --git a/userguide/en/concepts/nfs.md b/userguide/en/concepts/nfs.md new file mode 100644 index 0000000..641fedd --- /dev/null +++ b/userguide/en/concepts/nfs.md @@ -0,0 +1,19 @@ +## Dataspace ## + +In Docklet, each user has their own persistent Dataspace in which to +save data files. Dataspaces are isolated from each other: each user can +only access their own Dataspace. + +A user's Dataspace is shared among all nodes of their vcluster. It is +located in the `~/nfs` directory of the container node. + +When a user saves their Workspace as an image, the Dataspace is +excluded. + +The Dataspace is backed by a distributed file system. +Apart from the Dataspace, all data in containers is destroyed when +the containers are deleted. Data in the Dataspace persists +regardless of the lifecycle of containers. + +Users are encouraged to download and back up their important data +promptly. diff --git a/userguide/en/concepts/service.md b/userguide/en/concepts/service.md new file mode 100644 index 0000000..352f451 --- /dev/null +++ b/userguide/en/concepts/service.md @@ -0,0 +1,10 @@ +## Service ## + +Many software tools in the Docklet image can be run directly as scripts in the +terminal. But some software frameworks need to run as +services, e.g., [Spark](http://spark.apache.org). When users create a +Workspace, those frameworks are not started by default, to reduce +resource consumption. Users may configure their Docklet vcluster to +allow starting specified software frameworks. + +Docklet can be configured to start user-installed software services. diff --git a/userguide/en/concepts/vcluster.md b/userguide/en/concepts/vcluster.md new file mode 100644 index 0000000..b00ba79 --- /dev/null +++ b/userguide/en/concepts/vcluster.md @@ -0,0 +1,26 @@ +## vcluster ## + +Vcluster (virtual cluster) is the basic concept in Docklet. A vcluster +consists of one or more LXC container nodes; each node can be regarded +as a lightweight Linux host.
Usually, the container nodes of one +vcluster are generated from the same image, which simplifies the +management of the vcluster. + +The level of the user determines the quota of their vcluster, including +the maximum number of nodes the vcluster can have, the CPU and memory +configuration of each container node, etc. The default vcluster has one node. + +The nodes of a Docklet vcluster are designed to be distributed across as many +physical hosts as possible, in order to better utilize physical +resources. Those distributed nodes can communicate with each other. Each +Docklet vcluster uses an independent virtual sub-network; usually +the first node is the gateway of this sub-network. + +In most cases, users do not need to care about the details of the vcluster, +but only need to focus on their Workspace. + +In some special cases, users may need to configure the vcluster to meet +the requirements of the Workspace running on top of it. For example, in order to run +programs in parallel in the Workspace, users need to scale out the default +vcluster by adding more nodes. A Docklet vcluster supports elastic +scaling: nodes can be added or removed at any time. diff --git a/userguide/en/concepts/workspace.md b/userguide/en/concepts/workspace.md new file mode 100644 index 0000000..9ff8773 --- /dev/null +++ b/userguide/en/concepts/workspace.md @@ -0,0 +1,23 @@ +## Workspace ## + +Workspace is the core concept in Docklet. It represents a web-based +interactive working environment. The Docklet Workspace is provided by the +open source component [Jupyter Notebook](https://jupyter.org), with which +users can create and share documents that contain live code, equations, +visualizations and explanatory text. It can be used in the following +areas: data cleaning and transformation, numerical simulation, +statistical modeling, machine learning and much more. + +Most users do not care about the low-level details of physical +hosts, clusters, containers or vclusters.
They only need to open a +browser to enter their personal Workspace and then do +their jobs interactively, including visualization, in the browser. Currently +Jupyter Notebook supports more than 40 programming languages. + +The following is a snapshot of the [Jupyter Notebook](https://jupyter.org) Workspace: + +jupyter workspace + +For advanced users, Jupyter Notebook offers a WEB Terminal to accomplish +almost all non-visualization work. + diff --git a/userguide/en/images/config-image.png b/userguide/en/images/config-image.png new file mode 100644 index 0000000..61499a2 Binary files /dev/null and b/userguide/en/images/config-image.png differ diff --git a/userguide/en/images/config-workspace.png b/userguide/en/images/config-workspace.png new file mode 100644 index 0000000..5de53f5 Binary files /dev/null and b/userguide/en/images/config-workspace.png differ diff --git a/userguide/en/images/dashboard-start.png b/userguide/en/images/dashboard-start.png new file mode 100644 index 0000000..68f8549 Binary files /dev/null and b/userguide/en/images/dashboard-start.png differ diff --git a/userguide/en/images/dashboard-stop.png b/userguide/en/images/dashboard-stop.png new file mode 100644 index 0000000..0488400 Binary files /dev/null and b/userguide/en/images/dashboard-stop.png differ diff --git a/userguide/en/images/docklet-architecture.jpg b/userguide/en/images/docklet-architecture.jpg new file mode 100644 index 0000000..54d6476 Binary files /dev/null and b/userguide/en/images/docklet-architecture.jpg differ diff --git a/userguide/en/images/hosts-summary-config.png b/userguide/en/images/hosts-summary-config.png new file mode 100644 index 0000000..706334d Binary files /dev/null and b/userguide/en/images/hosts-summary-config.png differ diff --git a/userguide/en/images/hosts-summary-perf.png b/userguide/en/images/hosts-summary-perf.png new file mode 100644 index 0000000..b47a55e Binary files /dev/null and b/userguide/en/images/hosts-summary-perf.png differ diff --git
a/userguide/en/images/hosts.png b/userguide/en/images/hosts.png new file mode 100644 index 0000000..3bc6eb5 Binary files /dev/null and b/userguide/en/images/hosts.png differ diff --git a/userguide/en/images/jupyter-notebook-dashboard.png b/userguide/en/images/jupyter-notebook-dashboard.png new file mode 100644 index 0000000..dee4271 Binary files /dev/null and b/userguide/en/images/jupyter-notebook-dashboard.png differ diff --git a/userguide/en/images/jupyterpreview.png b/userguide/en/images/jupyterpreview.png new file mode 100644 index 0000000..e9588ca Binary files /dev/null and b/userguide/en/images/jupyterpreview.png differ diff --git a/userguide/en/images/sklearn1.png b/userguide/en/images/sklearn1.png new file mode 100644 index 0000000..83b1ab0 Binary files /dev/null and b/userguide/en/images/sklearn1.png differ diff --git a/userguide/en/images/status-summary.png b/userguide/en/images/status-summary.png new file mode 100644 index 0000000..2681e66 Binary files /dev/null and b/userguide/en/images/status-summary.png differ diff --git a/userguide/en/images/status.png b/userguide/en/images/status.png new file mode 100644 index 0000000..2ba4f8c Binary files /dev/null and b/userguide/en/images/status.png differ diff --git a/userguide/en/images/user_init.png b/userguide/en/images/user_init.png new file mode 100644 index 0000000..44139ab Binary files /dev/null and b/userguide/en/images/user_init.png differ diff --git a/userguide/en/images/workspace-create.png b/userguide/en/images/workspace-create.png new file mode 100644 index 0000000..519fd2b Binary files /dev/null and b/userguide/en/images/workspace-create.png differ diff --git a/userguide/en/intro/README.md b/userguide/en/intro/README.md new file mode 100644 index 0000000..32dfeba --- /dev/null +++ b/userguide/en/intro/README.md @@ -0,0 +1,14 @@ +# Docklet Introduction # + +Welcome to the world of Docklet! + +What is Docklet? + +Does it have any relationship with the famous Docker project? + +What advantages does it have?
+ +How to use Docklet? + +This section will answer those questions for you first. + diff --git a/userguide/en/intro/how.md b/userguide/en/intro/how.md new file mode 100644 index 0000000..b6926cc --- /dev/null +++ b/userguide/en/intro/how.md @@ -0,0 +1,14 @@ +## How to use Docklet ## + +To access their Workspace, Docklet users need only a +modern web browser, such as Firefox, Chrome or Safari. + + +The Docklet administrator configures the authentication method. +For example, the instance deployed at the Computing Center of Peking +University, http://iwork.internetware.org , supports authentication with both +Peking University accounts and Docklet internal accounts. + +Docklet also supports a Guest account. Users can try the "Workspace" using +the Guest account with some limitations. The Guest Workspace +restarts periodically. diff --git a/userguide/en/intro/what.md b/userguide/en/intro/what.md new file mode 100644 index 0000000..b168fe9 --- /dev/null +++ b/userguide/en/intro/what.md @@ -0,0 +1,53 @@ +## What is Docklet ## + +### Definition and Architecture ### + +Docklet is an open source project from the Software Engineering Institute, +Peking University. It is released under the [New BSD License](http://directory.fsf.org/wiki/License:BSD_3Clause). Its source code is hosted on GitHub. + +The goal of Docklet is to provide a **Personal Development +Workspace in the Cloud** solution for enterprises. With the help of Docklet, +enterprises can easily virtualize their small-scale data centers, +creating virtualized clusters for their staff, and then providing +users with a customizable **Workspace** in the cloud. Users only need a +modern **browser** to visit their own **Workspace** located in the +enterprise's Intranet from anywhere, at any time. They can do work such as +editing source code online, debugging, testing, managing data files, +analyzing data, visualizing results, etc.
+ + +The core technology of Docklet is the LXC virtual cluster (vcluster), which +is based on Linux Container (LXC) and software defined network +technologies. The Docklet **vcluster** provides a virtualization +solution for physical clusters and supports the installation and running of +most existing Linux applications. + +Based on vcluster, Docklet provides users with their **Personal Workspace**. +Users visit their Workspace through the integrated [Jupyter Notebook](https://github.com/jupyter/notebook) to do most development tasks. + +The architecture of Docklet is illustrated as follows: + + + + + +### Docklet and Docker ### + +Docklet has no direct relationship with [Docker](https://github.com/docker/docker). They are two different projects, with totally different design goals. + +The goal of Docker is to provide a lightweight operating system +virtualization solution. It is in the [IaaS layer](https://en.wikipedia.org/wiki/Cloud_computing) of cloud computing architecture. +Docker encapsulates LXC to make LXC easy to use, +so that [users can operate a Docker container as simply as operating +a fast and lightweight virtual machine](https://www.gitbook.com/book/yeasy/docker_practice). Although Docker later added a clustering component, +it in essence still focuses on the single-node operating system layer. + +The goal of Docklet is to provide a **Personal Development Workspace in +the Cloud** solution. It covers all of the [SaaS, PaaS and IaaS layers](https://en.wikipedia.org/wiki/Cloud_computing) of cloud computing architecture. +The basis of Docklet is the LXC vcluster, not the Docker container. +Docklet users work directly with their Workspace. They use a +browser to do software development, debugging, testing, etc., using the tools +Docklet provides, working at a higher layer.
diff --git a/userguide/en/intro/why.md b/userguide/en/intro/why.md new file mode 100644 index 0000000..46c185d --- /dev/null +++ b/userguide/en/intro/why.md @@ -0,0 +1,104 @@ +## Why to use Docklet ## + +As a "**Workspace Cloud**", Docklet has several advantages. + +First, what Docklet provides is a virtual workspace for developers, +not only an IaaS or PaaS solution, so it is more friendly to developers. + +Second, the Docklet vcluster provides a highly efficient and simple method +of sharing physical resources, so it is more friendly to +administrators. + +The detailed advantages and features of Docklet include: + +### Wide applicability ### + +Docklet is not designed for large data centers, but for small or **mini +data centers**. By **mini data center**, we mean a cluster with no more than 100 +physical hosts, all connected by a high speed network, each host with +relatively high performance. This is the common configuration of most +medium or small enterprises and institutes. Therefore, Docklet has +wide applicability. + +### Easily Sharing Cluster among Frameworks ### + +Traditionally clusters are often monopolized, serving only one kind of +application framework, e.g., MPI or Spark. It is usually difficult to +share the physical cluster among different frameworks. + +[Mesos](http://mesos.apache.org) and +[Yarn](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html) each try to provide a mechanism for sharing clusters. However, they require extra customization work for each supported framework, which is inflexible and somewhat difficult. + +Docklet provides a flexible and easy solution. Because the +physical clusters are virtualized, almost all computing frameworks can run in the +LXC vcluster seamlessly without any customization or modification. + +### Dynamically and Elastically Configure Resources ### + +Docklet vcluster supports elastically configuring resources.
Users can +add a fully functional LXC node to an existing vcluster instantly with just a +click. This feature is useful for application frameworks that +support elastic resource management, e.g., Spark. + +Docklet can monitor the activity of users, identify those who are idle, +and adjust their resource quota dynamically. This gives a good +experience to active online users, together with reasonable +resource utilization. + +### Consuming Very Few Network Resources ### + +Docklet utilizes the technology of software defined networks. +The inner vclusters use private Internet address space, e.g., +`172.16/16`, not consuming public Internet addresses. The administrator +only needs to configure **one** public Internet address/URL for the Docklet +portal. Users use just a modern browser to access their private +vcluster located behind the enterprise's firewall through the portal's +public URL. This feature is very attractive for enterprises +with few public Internet addresses. + +### User Friendly ### + +Docklet is mainly targeted at an enterprise's staff developers. It is +domain oriented, and can therefore be set up deliberately to meet the enterprise's +requirements. The public base image contains the most often used +software. Users can directly start their programming and analysis jobs +with just a browser, without the heavy burden of installing and +configuring physical clusters, installing software frameworks, +managing images, etc. Therefore, it is very friendly to end users, +which is especially important for non-expert users. + +### Ideal Environment for Application Development and Testing ### + +The ease of use of Docklet means it can be an ideal environment for +application development and testing. For example, users can program and +debug big data analysis software with a small data set (possibly +gigabyte scale). If it succeeds, they can then deploy it to the production system with +truly big data.
In fact, for most data analysis and mining jobs, the +data set is usually not so large, which means they can be run directly and +output their final results in Docklet. + +Docklet is also a good platform for exercising programming skills and +studying programming languages. There is no need to install various tools; +a browser is enough. + +### Easy to Customize and Extend ### + +The enterprise's Docklet administrator is free to make specific public base +images. Users can also install new software that is not in the base image. +What is more, users can save their Workspace as an image and then share it +with others, which is very convenient for users in the same group. + +### Supporting Multiple Users ### + +Docklet vcluster naturally supports multiple users. Each user has their +own independent private vcluster. Each vcluster can run different +frameworks such as MPI and Spark without worrying about interference. +Therefore, it can satisfy the different requirements of different users. + +### Advantages of LXC ### + +Docklet uses LXC technology. Therefore, it has the advantages of LXC and +LXC-based solutions like Docker, including booting containers in seconds; +high utilization of resources, running thousands of containers on one +physical host; high performance of applications; consuming fewer +resources; fast delivery; simple management, etc. diff --git a/userguide/en/notebook/README.md b/userguide/en/notebook/README.md new file mode 100644 index 0000000..f59d044 --- /dev/null +++ b/userguide/en/notebook/README.md @@ -0,0 +1,39 @@ +# Scientific Computation and Data Analysis # + +Docklet supports scientific computation and data analysis mainly through two languages, [Python](https://www.python.org) and [R](http://www.r-project.org), as well as a large number of packages. +Most work can be done in Jupyter Notebook, including data visualization, +with only a browser.
+ +There are many articles on the web about how to use Python and R to do big +data processing, scientific computing, machine learning and deep +learning, natural language processing, statistics, data mining, graph +processing, data visualization, etc. In these articles, what is valuable +to novice users is which packages can be used for which kinds of tasks. + +[ANACONDA](https://www.continuum.io/why-anaconda) from +[CONTINUUM](https://www.continuum.io) is a computation and analysis +platform. It integrates both useful open source Python packages and R packages. + +[Canopy](https://www.enthought.com/products/canopy/) from +[ENTHOUGHT](https://www.enthought.com) is a platform similar to ANACONDA. +It also integrates many useful Python packages for computation and +analysis. + +[RStudio](https://www.rstudio.com) provides the [RStudio +IDE](https://www.rstudio.com/products/rstudio2/) and several +[useful R packages](https://www.rstudio.com/products/rpackages/). + +Though many visualization and IDE packages need to run in the local +OS, most packages in Docklet can run in Jupyter Notebook as web apps. +Users can use `pip` or other tools to install the packages they need. + +The list of Python and R packages in Docklet: + + * [Python Packages](python.md) + * [R Packages](r.md) + +For tutorials and examples of using Jupyter Notebook in +activities like scientific computation, please refer to +[nbviewer](http://nbviewer.jupyter.org) and the [ipython +wiki](https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks). + diff --git a/userguide/en/notebook/gallery.md b/userguide/en/notebook/gallery.md new file mode 100644 index 0000000..d6674d5 --- /dev/null +++ b/userguide/en/notebook/gallery.md @@ -0,0 +1,588 @@ +## Gallery of Some Interesting Notebooks ## + +https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks shows a gallery of interesting notebooks in various situations. +Some notebooks require specific packages.
Docklet has already preinstalled many +often-used [Python3](python.md) and +[R](r.md) packages. If a needed package is not in Docklet, users can install it manually with `pip3`, the R shell, etc. + +1. [Entire books or other large collections of notebooks on a topic](#entire-books-or-other-large-collections-of-notebooks-on-a-topic) + * [Introductory Tutorials](#introductory-tutorials) + * [Programming and Computer Science](#programming-and-computer-science) + * [Statistics, Machine Learning and Data Science](#statistics-machine-learning-and-data-science) + * [Mathematics, Physics, Chemistry, Biology](#mathematics-physics-chemistry-biology) + * [Earth Science and Geo-Spatial data](#earth-science-and-geo-spatial-data) + * [Linguistics and Text Mining](#linguistics-and-text-mining) + * [Signal Processing](#signal-processing) + * [Engineering Education](#engineering-education) +1. [Scientific computing and data analysis with the SciPy Stack](#scientific-computing-and-data-analysis-with-the-scipy-stack) + * [General topics in scientific computing](#general-topics-in-scientific-computing) + * [Social data](#social-data) + * [Psychology and Neuroscience](#psychology-and-neuroscience) + * [Machine Learning, Statistics and Probability](#machine-learning-statistics-and-probability) + * [Physics, Chemistry and Biology](#physics-chemistry-and-biology) + * [Economics and Finance](#economics-and-finance) + * [Earth science and geo-spatial data](#earth-science-and-geo-spatial-data) + * [Data visualization and plotting](#data-visualization-and-plotting) + * [Mathematics](#mathematics) + * [Signal and Sound Processing](#signal-and-sound-processing) + * [Natural Language Processing](#natural-language-processing) + * [Pandas for data analysis](#pandas-for-data-analysis) +1. [General Python Programming](#general-python-programming) +1.
[Notebooks in languages other than Python](#notebooks-in-languages-other-than-python) + * [Julia](#julia) + * [Haskell](#haskell) + * [Ruby](#ruby) + * [Perl](#perl) +1. [Miscellaneous topics about doing various things *with* the Notebook itself](#miscellaneous-topics-about-doing-various-things-with-the-notebook-itself) +1. [Reproducible academic publications](#reproducible-academic-publications) +1. [Other publications using the Notebook](#other-publications-and-conference-abstracts-that-explicitly-use-the-notebook) +1. [Data-driven journalism](#data-driven-journalism) +1. [Whimsical notebooks](#whimsical-notebooks) +1. [Videos of IPython being used in the wild](#videos-of-ipython-being-used-in-the-wild) + +## Entire books or other large collections of notebooks on a topic + +### Introductory Tutorials + +* First things first, how to [run code in the notebook](https://nbviewer.jupyter.org/github/jupyter/notebook/blob/master/docs/source/examples/Notebook/Running%20Code.ipynb). There is also a general [collection of notebooks](https://nbviewer.jupyter.org/github/ipython/ipython/blob/master/examples/IPython%20Kernel/Index.ipynb) from IPython. Another useful one from this collection is an explanation of our [rich display system](https://nbviewer.jupyter.org/github/ipython/ipython/blob/master/examples/IPython%20Kernel/Custom%20Display%20Logic.ipynb). + +* A [great matplotlib tutorial](http://nbviewer.ipython.org/github/jrjohansson/scientific-python-lectures/blob/master/Lecture-4-Matplotlib.ipynb), part of the fantastic [Lectures on Scientific Computing with Python](http://nbviewer.ipython.org/github/jrjohansson/scientific-python-lectures/tree/master) by [J.R. Johansson](https://github.com/jrjohansson). + +* The code of the [IPython mini-book](https://github.com/rossant/ipython-minibook) by C. Rossant, introducing IPython, NumPy, SciPy, Pandas and matplotlib for interactive computing and data visualization. 
+ +* [Python Tutorial](https://github.com/rajathkumarmp/Python-Lectures) by [Rajath Kumar M P](https://github.com/rajathkumarmp) + +### Programming and Computer Science + +* [Introduction to Programming (using Python)](http://nbviewer.ipython.org/github/ehmatthes/intro_programming/blob/master/notebooks/index.ipynb), an entire introductory Python course written by [Eric Matthes](http://peak5390.wordpress.com/about). [This post](http://peak5390.wordpress.com/2013/09/22/how-ipython-notebook-and-github-have-changed-the-way-i-teach-python/) explains the educational context in an Alaskan high school where Eric is a teacher. + +* [Python for Developers](http://ricardoduarte.github.io/python-for-developers), a complete book on Python programming by [Ricardo Duarte](https://github.com/ricardoduarte). Note the book also exists [in Portuguese](http://ricardoduarte.github.io/python-para-desenvolvedores). + +* [CS1001.py - Extended Introduction to Computer Science](https://github.com/yoavram/CS1001.py). Recitations from Tel-Aviv University introductory course to computer science, assembled as IPython notebooks by [Yoav Ram](http://www.yoavram.com/). + +* [Exploratory Computing with Python](http://mbakker7.github.io/exploratory_computing_with_python/), a set of 15 Notebooks that cover exploratory computing, data analysis, and visualization. No prior programming knowledge required. Each Notebook includes a number of exercises (with answers) that should take less than 4 hours to complete. Developed by Mark Bakker for undergraduate engineering students at the Delft University of Technology. 
+ +* [Understanding evolutionary strategies and covariance matrix adaptation](http://nbviewer.ipython.org/github/lmarti/evolutionary-computation-course/blob/master/AEC.04%20-%20Evolutionary%20Strategies%20and%20Covariance%20Matrix%20Adaptation.ipynb), from the [Advanced Evolutionary Computation: Theory and Practice](http://nbviewer.ipython.org/github/lmarti/evolutionary-computation-course/tree/master/) course by [Luis Martí](http://lmarti.com). + +### Statistics, Machine Learning and Data Science + +* [AM207: Monte Carlo Methods, Stochastic Optimization](http://am207.org): a complete course by Verena Kaynig-Fittkau and Pavlos Protopapas from Harvard, with all lecture materials and homework sets as notebooks. + +* [An introduction to Bayesian inference](http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1_Introduction/Chapter1.ipynb), this is just chapter 1 in an ongoing book titled [Probabilistic Programming and Bayesian Methods for Hackers Using Python and PyMC](http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers), by [Cameron Davidson-Pilon](http://camdp.com/). + +* [Learn Data Science](http://learnds.com), an entire self-directed course by [Nitin Borwankar](https://github.com/nborwankar). + +* [IPython Cookbook](http://ipython-books.github.io/cookbook/) by [Cyrille Rossant](http://cyrille.rossant.net/), a comprehensive guide to Python for Data Science. The code of the 100 recipes is available on [the GitHub repository](https://github.com/ipython-books/cookbook-code). + +* [An introduction to machine learning with Python and scikit-learn](http://nbviewer.ipython.org/github/temporaer/tutorial_ml_gkbionics/blob/master/2%20-%20KMeans.ipynb) ([repo and overview](https://github.com/amueller/tutorial_ml_gkbionics)) by [Hannes Schulz](https://github.com/temporaer) and [Andreas Mueller](https://github.com/amueller). 
+ +* [Clustering and Regression](http://nbviewer.ipython.org/github/amplab/datascience-sp14/blob/master/hw2/HW2.ipynb), part of the UC Berkeley 2014 [Introduction to Data Science](http://amplab.github.io/datascience-sp14/) course taught by [Michael Franklin](http://www.cs.berkeley.edu/~franklin/). + +* [Neural Networks](http://nbviewer.ipython.org/github/masinoa/machine_learning/blob/master/04_Neural_Networks.ipynb), part of a collection on [machine learning](https://github.com/masinoa/machine_learning) by [Aaron Masino](https://github.com/masinoa). + +* [An introduction to Pandas](http://nbviewer.ipython.org/urls/bitbucket.org/hrojas/learn-pandas/raw/master/lessons/01%20-%20Lesson.ipynb), part of an [11-lesson tutorial on Pandas](https://bitbucket.org/hrojas/learn-pandas), by [Hernán Rojas](https://bitbucket.org/hrojas). + +* The [Statsmodels Project](http://statsmodels.sourceforge.net) has two excellent collections of examples: [in their official documentation](http://statsmodels.sourceforge.net/devel/examples/index.html) and [extra ones in their wiki](https://github.com/statsmodels/statsmodels/wiki/Examples#user-contributed-examples). Too many there to directly duplicate here, but they provide great learning materials on statistical modeling with Python. + +* [Machine Learning with the Shogun Toolbox](http://cloud.shogun-toolbox.org). This is a complete *service* that includes a ready-to-run IPython instance with a collection of notebooks illustrating the use of the [Shogun Toolbox](http://shogun-toolbox.org). Just log in and start running the examples. + +* [Python for Data Analysis](https://github.com/ResearchComputing/Meetup-Fall-2013), an introductory collection from the [CU Boulder Research Computing Group](http://researchcomputing.github.io/). 
+ +* [The Kaggle bulldozers competition example](http://danielfrg.github.io/blog/2013/03/07/kaggle-bulldozers-basic-cleaning), one of a set of tutorials on exploratory data analysis with the [copper toolkit](https://github.com/danielfrg/copper#copper) by [Daniel Rodríguez](http://danielfrg.github.io). + +* [Understanding model reliability](http://nbviewer.ipython.org/github/mwaskom/Psych216/blob/master/week6_tutorial.ipynb), part of a complete [course on statistics and data analysis for psychologists](https://github.com/mwaskom/Psych216) by [Michael Waskom](https://github.com/mwaskom). + +* [Graphical Representations of Linear Models](http://nbviewer.ipython.org/github/mwaskom/seaborn/blob/master/examples/linear_models.ipynb), an illustration of the [Seaborn statistical visualization library](https://github.com/mwaskom/seaborn), which also includes [Visualizing distributions of data](http://nbviewer.ipython.org/github/mwaskom/seaborn/blob/master/examples/plotting_distributions.ipynb) and [Representing variability in timeseries plots](http://nbviewer.ipython.org/github/mwaskom/seaborn/blob/master/examples/timeseries_plots.ipynb). By [Michael Waskom](https://github.com/mwaskom). + +* [Desperately Seeking Silver](http://nbviewer.ipython.org/github/cs109/content/blob/master/HW2.ipynb), one of the homework sets for Harvard's [CS 109 Data Science course](http://cs109.org). + +* [IPython Notebooks for 'An Introduction to Statistical Learning with Applications in R'](https://github.com/JWarmenhoven/ISLR-python), Python code for a selection of tables, figures and LAB sections from the book by James, Witten, Hastie, Tibshirani (2013). 
+ +* [Python Notebooks for StatLearning Exercises](https://github.com/sujitpal/statlearning-notebooks), Python implementations of the R labs for the [StatLearning: Statistical Learning](https://class.stanford.edu/courses/HumanitiesScience/StatLearning/Winter2014/about) online course from Stanford University taught by Profs Trevor Hastie and Rob Tibshirani. + +* [Applied Predictive Modeling with Python](http://nbviewer.ipython.org/github/leig/Applied-Predictive-Modeling-with-Python/tree/master/notebooks/), Python implementations of the examples (originally written in R) from a famous introductory book, [Applied Predictive Modeling](http://appliedpredictivemodeling.com), by Max Kuhn and Kjell Johnson. + +* A collection of [four courses in foundations of data science, algorithms and databases](http://nbviewer.ipython.org/github/ledeprogram/courses/tree/master) from multiple faculty at [Columbia University's Lede Program](http://www.journalism.columbia.edu/page/1058-the-lede-program-an-introduction-to-data-practices/906). + +* [SciPy and OpenCV as an interactive computing environment for computer vision](http://nbviewer.ipython.org/github/thsant/scipy4cv) by [Thiago Santos](http://ttsantos.net), a tutorial presented at [SIBGRAPI 2014](http://emap.fgv.br/sibgrapi-2014/tutorials.html). + +* [Kalman and Bayesian Filters in Python](http://nbviewer.ipython.org/github/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/table_of_contents.ipynb), by [Roger Labbe](https://github.com/rlabbe). + +* [Machine learning in Python](http://nbviewer.ipython.org/github/jdwittenauer/ipython-notebooks/blob/master/ML-Exercise1.ipynb), a series based on Andrew Ng's Coursera class on machine learning. Part of a [larger collection of data science notebooks](https://github.com/jdwittenauer/ipython-notebooks) by [John Wittenauer](https://github.com/jdwittenauer). 
+ +* [An example machine learning notebook](http://nbviewer.jupyter.org/github/rhiever/Data-Analysis-and-Machine-Learning-Projects/blob/master/example-data-science-notebook/Example%20Machine%20Learning%20Notebook.ipynb), by [Randal S. Olson](http://www.randalolson.com/), part of a [collection in Data Analysis and Machine Learning](http://nbviewer.jupyter.org/github/rhiever/Data-Analysis-and-Machine-Learning-Projects). + +### Mathematics, Physics, Chemistry, Biology + +* A [single-atom laser model](http://nbviewer.ipython.org/github/jrjohansson/qutip-lectures/blob/master/Lecture-2B-Single-Atom-Lasing.ipynb). This is one of a complete set of [lectures on quantum mechanics and quantum optics using QuTiP](http://nbviewer.ipython.org/github/jrjohansson/qutip-lectures) by [J.R. Johansson](https://github.com/jrjohansson). + +* [2-d rigid-body transformations](http://nbviewer.ipython.org/github/demotu/BMC/blob/master/notebooks/Transformation2D.ipynb). This is part of [Scientific Computing in Biomechanics and Motor Control](https://github.com/demotu/BMC), a complete collection of notebooks by [Marcos Duarte](https://github.com/demotu). + +* Astrophysical simulations and analysis with [yt](http://yt-project.org): a collection of example notebooks on using various codes that yt interfaces with: [Enzo](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/ac275ee8a462425c93b36e330e243705.ipynb), [Gadget](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/e209c55b6aaa4a9ab12c55422bb3afdc.ipynb), [RAMSES](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/9fe5b4bc68ce48d5aa665edbc4ec3aa3.ipynb), [PKDGrav](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/9fe5b4bc68ce48d5aa665edbc4ec3aa3.ipynb) and [Gasoline](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/9fe5b4bc68ce48d5aa665edbc4ec3aa3.ipynb). *Note:* the yt site currently throws an SSL warning; they seem to have an outdated or self-signed certificate. 
+ +* [Working with Reactions](http://nbviewer.ipython.org/gist/4316430), part of a set of tutorials on [cheminformatics and machine learning with the rdkit project](http://code.google.com/p/rdkit/wiki/UGM2012Tutorials), by Greg Landrum. + +* [CFD Python: 12 steps to Navier-Stokes](http://lorenabarba.com/blog/cfd-python-12-steps-to-navier-stokes). A complete set of lectures on Computational Fluid Dynamics, from 1-d linear waves to full 2-d Navier-Stokes, by [Lorena Barba](http://lorenabarba.com). + +* [AeroPython: Aerodynamics-Hydrodynamics with Python](https://github.com/barbagroup/AeroPython), a complete course taught at George Washington University by [Lorena Barba](http://lorenabarba.com). + +* [Practical Numerical Methods with Python](https://github.com/numerical-mooc/numerical-mooc), a collection of learning modules (each consisting of several IPython Notebooks) for a course in numerical differential equations taught at George Washington University by [Lorena Barba](http://lorenabarba.com). Also offered as a "massive, open online course" (MOOC) on the [GW SEAS Open edX](http://openedx.seas.gwu.edu/courses/GW/MAE6286/2014_fall/about) platform. + +* [pyuvvis: tools for explorative spectroscopy](https://github.com/hugadams/pyuvvis), a spectroscopy library built for integration with IPython notebooks, matplotlib and pandas. + +* [HyperPython: a practical introduction to the solution of hyperbolic conservation laws](http://nbviewer.ipython.org/github/ketch/HyperPython/tree/master/), a course by [David Ketcheson](http://davidketcheson.info). + +* [An Introduction to Applied Bioinformatics](http://caporasolab.us/An-Introduction-To-Applied-Bioinformatics): Interactive lessons in bioinformatics, by [Greg Caporaso](http://caporasolab.us). 
+ +* Colour science computations with [colour](https://github.com/colour-science/colour), a Python package implementing a comprehensive number of colour theory transformations and algorithms supported by a [dedicated collection of IPython Notebooks](http://nbviewer.ipython.org/github/colour-science/colour-ipython/blob/master/notebooks/colour.ipynb). More colour science related [IPython Notebooks](http://nbviewer.ipython.org/github/colour-science/colour-website/tree/master/ipython/) are available on [colour-science.org](http://colour-science.org/). + +* The [notebooks](https://github.com/tiagoantao/bioinf-python/blob/master/notebooks/Welcome.ipynb) from the Book [Bioinformatics with Python Cookbook](http://www.amazon.com/Bioinformatics-Python-Cookbook-Tiago-Antao/dp/1782175113), covering several fields like Next-Generation Sequencing, Population Genetics, Phylogenetics, Genomics, Proteomics and Geo-referenced information. + +### Earth Science and Geo-Spatial data + +* [EarthPy](http://earthpy.org), a collection of IPython notebooks with a focus on Earth Sciences, from [whale tracks](http://earthpy.org/analyzing-whale-tracks.html) to the [flow of the Amazon](http://earthpy.org/earthpy-basemap-amazon.html). + +* [Python for Geosciences](https://github.com/koldunovn/python_for_geosciences), a tutorial series aimed at the Earth Sciences community, by [Nikolay Koldunov](http://earthpy.org/author/nikolay-koldunov.html). + +* [Find graffiti close to NY subway entrances](http://nbviewer.ipython.org/github/invisibleroads/crosscompute-tutorials/blob/master/computational-analysis/300%20Count%20graffiti%20sightings%20within%20100%20feet%20of%20a%20subway%20entrance.ipynb), one of a rich [collection of notebooks on large-scale data analysis](https://github.com/invisibleroads/crosscompute-tutorials), by [Roy Hyunjin Han](https://github.com/invisibleroads). 
+ +* [Logistic models of well switching in Bangladesh](http://nbviewer.ipython.org/github/carljv/Will_it_Python/blob/master/ARM/ch5/arsenic_wells_switching.ipynb), part of the ["Will it Python"](http://slendrmeans.wordpress.com/will-it-python) blog series ([repo](https://github.com/carljv/Will_it_Python)) on Machine Learning and data analysis in Python. By Carl Vogel. + +* [Estimated likelihood of observing a large earthquake on a continental low‐angle normal fault and implications for low‐angle normal fault activity](http://nbviewer.ipython.org/github/cossatot/lanf_earthquake_likelihood/blob/master/notebooks/lanf_manuscript_notebook.ipynb), an executable version of a paper by Richard Styron and Eric Hetland published in *Geophysical Research Letters*, on earthquake probabilities. + +* [python4oceanographers](https://ocefpaf.github.io/python4oceanographers/), a blog demonstrating analyses in physical oceanography from [resource-demanding numerical computations with functions in compiled languages](https://ocefpaf.github.io/python4oceanographers/blog/2015/10/05/isosurfaces/) to specialized [tidal analysis](https://ocefpaf.github.io/python4oceanographers/blog/2015/02/16/utide/) to [visualization of various geo data using fancy things like interactive maps](https://ocefpaf.github.io/python4oceanographers/blog/2015/07/13/interactive_geo/). + +### Linguistics and Text Mining + +* [Workshop on text analysis](http://nbviewer.ipython.org/github/nealcaren/workshop_2014/tree/master/notebooks/) by [Neal Caren](http://nealcaren.web.unc.edu). + +* [Detecting Algorithmically Generated Domains](http://nbviewer.ipython.org/github/ClickSecurity/data_hacking/blob/master/dga_detection/DGA_Domain_Detection.ipynb), part of the [Data Hacking](http://clicksecurity.github.io/data_hacking) collection on security-oriented data analysis with IPython & friends. + +* [Mining the Social Web (2nd Edition)](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition). 
A complete collection of notebooks accompanying [Matthew Russell's book](http://shop.oreilly.com/product/0636920030195.do) by O'Reilly. + +### Signal Processing + +* [Sound Analysis with the Fourier Transform](https://github.com/calebmadrigal/FourierTalkOSCON). A set of IPython Notebooks by [Caleb Madrigal](http://calebmadrigal.com) to explain what the Fourier Transform is and how to use it for basic audio processing applications. + +* [An introduction to Compressed Sensing](http://nbviewer.ipython.org/github/unpingco/Python-for-Signal-Processing/blob/master/Compressive_Sampling.ipynb), part of [Python for Signal Processing](http://nbviewer.ipython.org/github/unpingco/Python-for-Signal-Processing): an entire book (and [blog](http://python-for-signal-processing.blogspot.com)) on the subject by Jose Unpingco. + +* [Kalman and Bayesian Filters in Python](http://nbviewer.ipython.org/github/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/table_of_contents.ipynb). A textbook and accompanying filtering library on the topic of Kalman filtering and other related Bayesian filtering techniques. + +* [Classify human movements using Dynamic Time Warping & K Nearest Neighbors:](http://nbviewer.ipython.org/github/markdregan/K-Nearest-Neighbors-with-Dynamic-Time-Warping/blob/master/K_Nearest_Neighbor_Dynamic_Time_Warping.ipynb) Signals from a smart phone gyroscope and accelerometer are used to classify whether the person is running, walking, sitting, standing, etc. This IPython notebook contains a Python implementation of the DTW and KNN algorithms along with explanations and a practical application. + +* [Digital Signal Processing](https://github.com/spatialaudio/digital-signal-processing-lecture). A collection of notebooks that accompanies a master's course on the topic. + +### Engineering Education + +* [Introduction to Chemical Engineering Analysis](http://jckantor.github.io/CBE20255/) by [Jeff Kantor](http://jckantor.github.io/). 
A collection of IPython notebooks illustrating topics in introductory chemical engineering analysis, including stoichiometry, generation-consumption analysis, mass and energy balances. + +## Scientific computing and data analysis with the SciPy Stack +### General topics in scientific computing + +* [Algorithms in IPython notebooks](https://github.com/rasbt/algorithms_in_ipython_notebooks), by [Sebastian Raschka](https://github.com/rasbt) + +* [Comparing the performance of Python compilers - Cython vs. Numba vs. Parakeet](_nbs/day4_2_cython_numba_parakeet.ipynb?create=1), by [Sebastian Raschka](https://github.com/rasbt) + +* [A Crash Course in Python for Scientists](http://nbviewer.ipython.org/gist/5920182), by Sandia's [Rick Muller](http://www.cs.sandia.gov/~rmuller). + +* [A gentle introduction to scientific programming in Python, biased towards biologists](http://nbviewer.ipython.org/url/atwallab.cshl.edu/teaching/QBbootcamp3.ipynb), by [Mickey Atwal, Cold Spring Harbor Laboratory](http://atwallab.cshl.edu). + +* [Python for Data Science](http://nbviewer.ipython.org/github/gumption/Python_for_Data_Science/blob/master/Python_for_Data_Science_all.ipynb), a self-contained mini-course with exercises, by [Joe McCarthy](http://interrelativity.com/joe). + +* [First few lectures of the UW/Coursera course on Data Analysis](http://nbviewer.ipython.org/github/fonnesbeck/ComputationalMethodsCourse/blob/master/Lecture%201.ipynb). ([Repo](https://github.com/fonnesbeck/ComputationalMethodsCourse)) by Chris Fonnesbeck. + +* [CythonGSL: a Cython interface for the GNU Scientific Library (GSL)](http://nbviewer.ipython.org/github/twiecki/CythonGSL/blob/master/examples/cython_gsl_ipythonnb.ipynb) ([Project repo](https://github.com/twiecki/CythonGSL)), by Thomas Wiecki. + +* [Using Numba to speed up numerical codes](http://nbviewer.ipython.org/gist/3914904). And another Numba example: [self-organizing maps](http://nbviewer.ipython.org/gist/3407544). 
+ +* [Numpy performance tricks](http://nbviewer.ipython.org/gist/4645217), and [blog post](http://cyrille.rossant.net/numpy-performance-tricks/), by [Cyrille Rossant](http://cyrille.rossant.net/). + +* [IPython Parallel Push/Execute/Pull Demo](http://nbviewer.ipython.org/gist/3866987) by Justin Riley. + +* [Understanding the design of the R "formula" objects](http://nbviewer.ipython.org/url/perrin.dynevor.org/exploring_r_formula.ipynb) by Matthew Brett. + +* [Comparing different approaches to evolutionary simulations](http://nbviewer.ipython.org/url/www.slideviper.oquanta.info/test/simulation%20frameworks.ipynb). Also available [here](http://slideviewer.herokuapp.com/urls/github.com/yoavram/ipython-notebooks/raw/master/simulation%20frameworks.ipynb) for better visualization. The notebook was converted to an HTML presentation using an old version of nbconvert with the first development implementation of the `reveal` converter. By Yoav Ram. + +* [The Traveling Salesperson Problem](http://nbviewer.ipython.org/url/norvig.com/ipython/TSPv3.ipynb) by [Peter Norvig](http://norvig.com). + +* [A git tutorial targeted at scientists](http://nbviewer.ipython.org/github/fperez/reprosw/blob/master/Version%20Control.ipynb) by Fernando Perez. + +* [Running MATLAB in an IPython Notebook](http://nbviewer.ipython.org/github/plotly/IPython-plotly/blob/master/See%20more/MATLAB_Wrapper_Examples.ipynb), using [pymatbridge](https://pypi.python.org/pypi/pymatbridge). + +* [Interactive Curve-Fitting](http://nbviewer.ipython.org/gist/danielballan/1c2db3d4f2f7780cf52f). The `lmfit` package provides a widget-based interface to the curve-fitting algorithms in SciPy. 
+ +* [A visual guide to the Python Spark API for distributed computing](http://nbviewer.ipython.org/github/jkthompson/pyspark-pictures/blob/master/pyspark-pictures.ipynb) by Jeff Thompson. + +### Social data + +* [Survival Analysis](http://nbviewer.ipython.org/github/CamDavidsonPilon/lifelines/blob/master/docs/Survival%20Analysis%20intro.ipynb), an illustration of the [lifelines library](https://github.com/CamDavidsonPilon/lifelines), by [Cam Davidson Pilon](https://github.com/CamDavidsonPilon). + +* A reconstruction of [Nate Silver's 538 model for the 2012 US Presidential Election](http://nbviewer.ipython.org/github/jseabold/538model/blob/master/silver_model.ipynb), by Skipper Seabold (complete [repo](https://github.com/jseabold/538model)). + +* [Data about the Sandy Hook massacre in Newtown, Connecticut](http://nbviewer.ipython.org/url/www.brianckeegan.com/data/Shootings/Shootings.ipynb), which accompanies a more detailed [blog post on the subject](http://www.brianckeegan.com/2012/12/sandy-hook-school-massacre/). Here are the [notebook and accompanying data](http://www.brianckeegan.com/data/Shootings/Shootings.ipynb). By Brian Keegan. + +* More on [gun violence analysis with Wikipedia data](http://nbviewer.ipython.org/gist/4358066). + +* [An analysis of the Gaza-Israel 2012 crisis](http://nbviewer.ipython.org/gist/4121857). + +* [Ranking NFL Teams](http://nbviewer.ipython.org/github/seanjtaylor/NFLRanking/blob/master/NFL%20Rankings.ipynb). The [full repo](https://github.com/seanjtaylor/NFLRanking) also includes an explanatory slideshow. By Sean Taylor. + +* [Automated processing of news media and generation of associated imagery](http://nbviewer.ipython.org/url/mhermans.net/files/tmp/demo_rdf_HLN.ipynb). + +* [An analysis of national school standardized test data in Colombia using Pandas](http://nbviewer.ipython.org/url/finiterank.com/saber/saber.ipynb) (in Spanish). By [Javier Moreno](http://finiterank.com). 
+ +* [Getting started with GDELT](http://nbviewer.ipython.org/github/dmasad/GDELT_Intro/blob/master/Getting_Started_with_GDELT.ipynb), by [David Masad](https://github.com/dmasad). [GDELT](http://eventdata.psu.edu/data.dir/GDELT.html) is a dataset containing more than 200-million geolocated events with global coverage for 1979 to the present. Another GDELT example from David, that nicely [integrates mapping visualizations](http://nbviewer.ipython.org/github/dmasad/GDELT_Intro/blob/master/GDELT_Mapping.ipynb). + +* [Titanic passengers, coal mining disasters, and vessel speed changes](http://nbviewer.ipython.org/gist/fonnesbeck/8495259), by [Christopher Fonnesbeck](http://biostat.mc.vanderbilt.edu/wiki/Main/ChrisFonnesbeck) + +* A geographic analysis of [Indonesian conflicts in 2012](http://nbviewer.ipython.org/github/herrfz/gdelt/blob/master/indn_connection.ipynb) with GDELT, by [herrfz](https://github.com/herrfz). + +* [Bioinformatic Approaches to the Computation of Poetic Meter](http://nbviewer.ipython.org/github/asp49/meter/blob/graph/Shared%20Horizons%20Presentation.ipynb), by [A. Sean Pue](https://github.com/asp49), [C. Titus Brown](https://github.com/ctb) and [Tracy Teal](https://github.com/tracyteal). + +* Analyzing the [Vélib dataset from Paris](http://nbviewer.ipython.org/gist/5520933), by [Cyrille Rossant](https://github.com/rossant) (Vélib is Paris' [bicycle-sharing program](http://en.wikipedia.org/wiki/V%C3%A9lib')). + +* [Using Python to see how the Times writes about men and women](http://nbviewer.ipython.org/gist/5105037), by [Neal Caren](https://twitter.com/HaphazardSoc). + +* [Exploring graph properties of the Twitter stream with twython and NetworkX](http://nbviewer.ipython.org/gist/5681541/TwitterGraphs.ipynb), by [F. Perez](http://fperez.org) (complete [gist repo with utilities here](https://gist.github.com/fperez/5681541).) 
+ +* [Kaggle Competition: Titanic Machine Learning from Disaster](http://nbviewer.ipython.org/github/agconti/kaggle-titanic/blob/master/Titanic.ipynb). By [Andrew Conti](https://github.com/agconti). + +* [How clean are San Francisco's restaurants?](http://nbviewer.ipython.org/github/Jay-Oh-eN/happy-healthy-hungry/blob/master/h3.ipynb), a data science tutorial that accompanies a [blog post](http://blog.zipfianacademy.com/post/57158627293/how-to-data-science-mapping-sf-restaurant-inspection) from [Zipfian Academy](http://zipfianacademy.com/). + +* [NYT gender wage gap and US crime by state](http://nbviewer.ipython.org/github/plotly/IPython-plotly/blob/master/Bubble%20Chart%20Explorer.ipynb). + +* [Predicting usage of the subway system in NYC](http://nbviewer.ipython.org/url/www.asimihsan.com/articles/Intro%20to%20Data%20Science%20-%20Final%20Project.ipynb), a [final project](http://blog.udacity.com/2014/05/intro-to-data-science-tools-to-ask.html) for the Udacity Intro to Data Science Course, by [Asim Ihsan](http://www.asimihsan.com). + +* [An exploratory statistical analysis of the 2014 World Cup Final](http://nbviewer.ipython.org/github/rjtavares/football-crunching/blob/master/notebooks/an%20exploratory%20data%20analysis%20of%20the%20world%20cup%20final.ipynb), by [Ricardo Tavares](https://github.com/rjtavares). Part of a [notebook collection on football (aka soccer) analysis](http://nbviewer.ipython.org/github/rjtavares/football-crunching/tree/master/). + +* [San Francisco's Drug Geography](http://nbviewer.ipython.org/github/lmart999/GIS/blob/master/SF_GIS_Crime.ipynb), a GIS analysis of public crime data in SF, by [Lance Martin](https://github.com/lmart999). + +### Psychology and Neuroscience +* [Cue Combination with Neural Populations](http://nbviewer.ipython.org/github/wtadler/cue-combination-with-neurons/blob/master/neural_cue_combination.ipynb) by [Will Adler](http://www.wtadler.com). 
Intuition and simulation for the theory (Ma *et al.*, 2006) that through probabilistic population codes, neurons can perform optimal cue combination with simple linear operations. Demonstrates that variance in cortical activity, rather than impairing sensory systems, is an adaptive mechanism to encode uncertainty in sensory measurements. + +* [Modeling psychophysical data with non-linear functions](http://nbviewer.ipython.org/github/arokem/teach_optimization/blob/master/optimization.ipynb) by Ariel Rokem. + +* [Visualizing mathematical models of brain cell connections](http://nbviewer.ipython.org/github/jonasnick/ReceptiveFields/blob/master/receptiveFields.ipynb). The effect of convolution of different receptive field functions and natural images is examined. + +* [Python for Vision Research](http://nbviewer.ipython.org/github/gestaltrevision/python_for_visres/blob/master/index.ipynb). A three-day crash course for vision researchers in programming with Python, building experiments with [PsychoPy](http://psychopy.org/) and [psychopy_ext](http://psychopy_ext.klab.lt/), learning fMRI multi-voxel pattern analysis with [PyMVPA](http://www.pymvpa.org/), and understanding image processing in Python. + +* [Loading and visualizing fMRI data](http://nbviewer.ipython.org/github/GaelVaroquaux/nilearn_course/blob/master/rendered_notebooks/1_Introduction.ipynb), part of the [Functional connectivity with NiLearn course](https://github.com/GaelVaroquaux/nilearn_course) by [Gaël Varoquaux](http://gael-varoquaux.info). + +### Machine Learning, Statistics and Probability + +* [An introduction to parallel machine learning with sklearn, joblib and IPython.parallel](http://nbviewer.ipython.org/gist/5115540/Model%20Selection%20for%20the%20Nystroem%20Method.ipynb), a notebook that accompanies this [slide deck](https://speakerdeck.com/ogrisel/parallel-and-large-scale-machine-learning-with-scikit-learn) by [Olivier Grisel](https://github.com/ogrisel). 
+ +* [A tutorial introduction to machine learning with sklearn](http://amueller.github.com/sklearn_tutorial), an IPython-based slide deck by [Andreas Mueller](https://github.com/amueller). + +* [Introduction to Machine Learning in Python with scikit-learn](http://ipython-books.github.io/featured-04/) by [Cyrille Rossant](http://cyrille.rossant.net/), a free recipe from the [IPython Cookbook](http://ipython-books.github.io/cookbook/), a comprehensive guide to Python for Data Science. + +* [An introduction to Predictive Modeling in Python](http://nbviewer.ipython.org/github/ogrisel/parallel_ml_tutorial/blob/master/solutions/01%20-%20An%20Introduction%20to%20Predictive%20Modeling%20in%20Python.ipynb), by [Olivier Grisel](https://github.com/ogrisel). + +* [Face Recognition on a subset of the Labeled Faces in the Wild dataset](http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/Labeled%20Faces%20in%20the%20Wild%20recognition.ipynb), by [Olivier Grisel](http://github.com/ogrisel). + +* [An Introduction to Bayesian Methods for Multilevel Modeling](http://nbviewer.ipython.org/github/fonnesbeck/multilevel_modeling/blob/master/multilevel_modeling.ipynb), by [Chris Fonnesbeck](http://github.com/fonnesbeck). + +* [Introduction to Bayesian Networks](http://nbviewer.ipython.org/github/kuitang/hackny-bayesnet/blob/master/hackNY%20Bayesian%20Network%20Demo.ipynb) by [Kui Tang](https://github.com/kuitang) + +* [Bayesian data analysis with PyMC3](http://nbviewer.ipython.org/github/twiecki/pymc3_talk/blob/master/bayesian_pymc3.ipynb) by [Thomas Wiecki](https://github.com/twiecki). + +* [A collection of examples for solving pattern classification problems](https://github.com/rasbt/pattern_classification), by [Sebastian Raschka](https://github.com/rasbt). 
+ +* [Introduction to Linear Regression using Python](http://nbviewer.ipython.org/github/justmarkham/DAT4/blob/master/notebooks/08_linear_regression.ipynb) by [Kevin Markham](https://github.com/justmarkham) + +* [Machine learning in Python](http://nbviewer.ipython.org/github/jdwittenauer/ipython-notebooks/blob/master/ML-Exercise1.ipynb), a series based on Andrew Ng's Coursera class on machine learning. Part of a [larger collection of data science notebooks](https://github.com/jdwittenauer/ipython-notebooks) by [John Wittenauer](https://github.com/jdwittenauer). + +* [Probability, Paradox, and the Reasonable Person Principle](http://nbviewer.ipython.org/url/norvig.com/ipython/Probability.ipynb), by Peter Norvig. + +### Physics, Chemistry and Biology + +* [Multibody dynamics and control with Python](http://www.moorepants.info/blog/npendulum.html) and the [notebook file](http://nbviewer.ipython.org/github/gilbertgede/idetc-2013-paper/blob/master/n-pendulum-control.ipynb) by Jason K. Moore. + +* [Manipulation and display of chemical structures](http://nbviewer.ipython.org/gist/4316433), by [Greg Landrum](https://github.com/greglandrum), using rdkit. + +* [The sound of Hydrogen](http://nbviewer.ipython.org/github/Carreau/posts/blob/master/07-the-sound-of-hydrogen.ipynb), visualizing and listening to the quantum-mechanical spectrum of Hydrogen. By [Matthias Bussonnier](http://github.com/Carreau). + +* Particle physics at the Large Hadron Collider (LHC): using [ROOT in an LHCb masterclass](http://root.cern.ch/drupal/content/ipython-notebooks-and-root-0): [Notebook 1](http://nbviewer.ipython.org/github/mazurov/webfest2013/blob/master/notebooks/MasterClassD0-ex1.ipynb) and [Notebook 2](http://nbviewer.ipython.org/github/mazurov/webfest2013/blob/master/notebooks/MasterClassD0-ex2%2Cex3.ipynb) notebooks by Alexander Mazurov and Andrey Ustyuzhanin at CERN. 
+ +* [A Reaction-Diffusion Equation Solver in Python with Numpy](http://nbviewer.ipython.org/github/waltherg/notebooks/blob/master/2013-12-03-Crank_Nicolson.ipynb), a demonstration of how IPython notebooks can be used to discuss both the theory and implementation of numerical algorithms on one page, by [Georg Walther](https://github.com/waltherg). + +* [Comparing different approaches to evolutionary simulations](http://nbviewer.ipython.org/url/www.slideviper.oquanta.info/test/simulation%20frameworks.ipynb). Also available [here](http://slideviewer.herokuapp.com/urls/github.com/yoavram/ipython-notebooks/raw/master/simulation%20frameworks.ipynb) for better visualization. The notebook was converted to an HTML presentation using an old version of nbconvert with the first development implementation of the `reveal` converter. By Yoav Ram. + +### Economics and Finance + +* [Replication of the highly-contentious analysis of economic growth by Reinhart and Rogoff](http://nbviewer.ipython.org/github/vincentarelbundock/Reinhart-Rogoff/blob/master/reinhart-rogoff.ipynb), by [Vincent Arel-Bundock](http://umich.edu/~varel), [full repo here](https://github.com/vincentarelbundock/Reinhart-Rogoff). This is based on the [widely-publicized critique of the original analysis done by Herndon, Ash, and Pollin](http://www.peri.umass.edu/236/hash/31e2ff374b6377b2ddec04deaa6388b1/publication/566). + +* [fecon235 for Financial Economics](https://github.com/rsvp/fecon235), a series of notebooks that examine time-series data for economics and finance. It provides an easy API to freely access data from the Federal Reserve, SEC, CFTC, and stock and futures exchanges, so research from older notebooks can be replicated and updated using the most current data. For example, one notebook forecasts likely Fed policy for setting the [Fed Funds rate](https://git.io/fedfunds), while market sentiment across major asset classes is observable from the [CFTC Commitment of Traders Report](https://git.io/cotr). 
Major economic indicators are renormalized: for example, various measures of [inflation](https://git.io/infl), optionally with the forward-looking break-even rates derived from U.S. Treasury bonds. Other notebooks examine international markets, especially gold and foreign exchange. + +### Earth science and geo-spatial data +* [Exploring seafloor habitats: geographic analysis using IPython Notebook with GRASS & R](http://nbviewer.ipython.org/gist/7598354). This embeds a slideshow and a Web Spinning Globe (Cesium) in the notebook. By Massimo Di Stefano. + +* [Geo-Spatial Data with IPython](http://nbviewer.ipython.org/github/mqlaql/geospatial-data/blob/master/Geospatial-Data-with-Python.ipynb). Tutorial by [Kelsey Jordahl](http://kjordahl.net/) from SciPy2013. + +### Data visualization and plotting + +* A [Notebook](http://nbviewer.ipython.org/github/plotly/python-user-guide/blob/master/s3_bubble-charts/s3_bubble-charts.ipynb) with an interactive Hans Rosling Gapminder bubble chart from [Plotly](https://plot.ly/api/Python). + +* [Data and visualization integration via web based resources](http://tw.rpi.edu/media/2013/09/25/a48/The_Perfect_Storm_1991.html). Using NetCDF, Matplotlib, IPython Parallel and ffmpeg to generate video animation from time series of gridded data. By Massimo Di Stefano. + +* [21 Interactive, D3 Plots from matplotlib, ggplot for Python, prettyplotlib, Stack Overflow, and seaborn](http://nbviewer.ipython.org/gist/msund/7ac1203ded66fe8134cc). + +* [Visualizing complex-valued functions with Matplotlib and Mayavi](http://nbviewer.ipython.org/github/empet/Math/blob/master/DomainColoring.ipynb), by [Emilia Petrisor](https://github.com/empet). + +* [A D3 Viewer for Matplotlib Visualizations](http://jakevdp.github.io/blog/2013/12/19/a-d3-viewer-for-matplotlib/), which, unlike the example above, does not depend on a Plot.ly account. 
+ +* [Bokeh is an interactive web visualization library for Python](http://nbviewer.ipython.org/github/damianavila/bokeh_overview/blob/master/Bokeh%20Overview.ipynb?create=1) (and other languages). It provides d3-like novel graphics, over large datasets, all without requiring any knowledge of Javascript. It also has a Matplotlib compatibility layer. + +* [Winner of the 2014 E. Tufte Slope Graphs contest](http://nbviewer.ipython.org/gist/pascal-schetelat/8382651), by [Pascal Schetelat](https://github.com/pascal-schetelat). The original [contest info on Tufte's site](http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003nk). + +* [matta, d3.js-based visualizations in the IPython Notebook](http://nbviewer.ipython.org/github/carnby/matta/blob/master/examples/Basic%20Examples.ipynb), by [Eduardo Graells-Garrido](http://carnby.github.io/). + +### Mathematics + +* [Linear algebra with Cython](http://nbviewer.ipython.org/github/carljv/cython_testing/blob/master/cython_linalg.ipynb). A tutorial that styles the notebook differently to show that you can produce high-quality typography online with the Notebook. By Carl Vogel. + +* [Exploring how smooth-looking functions can have very surprising derivatives even at low orders](http://nbviewer.ipython.org/url/finiterank.com/cuadernos/suavesylocas.ipynb), combining SymPy and matplotlib. By [Javier Moreno](http://finiterank.com). + +* [A Collection of Applied Mathematics and Machine Learning Tutorials](http://sayilarvekuramlar.blogspot.co.uk/2015/12/matematik-ders-notlari.html) (in Turkish). By Burak Bayramli. + +* [Function minimization with iminuit](http://nbviewer.ipython.org/github/iminuit/iminuit/blob/master/tutorial/tutorial.ipynb), an introductory companion to their [hard core tutorial](http://nbviewer.ipython.org/github/iminuit/iminuit/blob/master/tutorial/hard-core-tutorial.ipynb). By the [iminuit project](http://iminuit.github.io/iminuit). 
+
+* [The Discrete Cosine Transform](http://nbviewer.ipython.org/url/cs.marlboro.edu/courses/spring2014/information/code/dct/dct.ipynb), a brief explanation and illustration of the math behind the DCT and its role in the JPEG image format, by [Jim Mahoney](http://cs.marlboro.edu).
+
+* [Chebfun in Python](http://nbviewer.ipython.org/gist/6724986), a demo of [PyChebfun](https://github.com/cswiercz/pychebfun), by [Olivier Verdier](http://www.olivierverdier.com). PyChebfun is a pure-Python implementation of the celebrated [Chebfun package by Battles and Trefethen](http://people.maths.ox.ac.uk/trefethen/publication/PDF/2004_107.pdf).
+
+* [The Matrix Exponential](http://nbviewer.ipython.org/github/sdrelton/matrix_function_notebooks/blob/master/TheMatrixExponential.ipynb),
+an introduction to the matrix exponential, its applications, and a list of available software in Python and MATLAB. By [Sam Relton](http://www.maths.manchester.ac.uk/~srelton/).
+
+* [Fractals, complex numbers, and your imagination](http://nbviewer.ipython.org/github/cfangmeier/ipython_notebooks/blob/master/Imagination.ipynb), by [Caleb Fangmeier](https://github.com/cfangmeier).
+
+* [A SymPy tutorial](http://nbviewer.ipython.org/url/www.inp.nsk.su/~grozin/python/sympy.ipynb), by [Andrey Grozin](http://www.inp.nsk.su/~grozin/).
+
+### Signal and Sound Processing
+
+* [Simulation of Delta Sigma modulators in Python](http://nbviewer.ipython.org/github/ggventurini/python-deltasigma/blob/master/examples/dsdemo1.ipynb) with [deltasigma](https://github.com/ggventurini/python-deltasigma), a Python port of Richard Schreier's *excellent* [MATLAB Delta Sigma Toolbox](http://www.mathworks.com/matlabcentral/fileexchange/19-delta-sigma-toolbox), by [Giuseppe Venturini](https://github.com/ggventurini). Several demonstration notebooks are linked from the package [README](https://github.com/ggventurini/python-deltasigma/blob/master/README.md).
+
+* [PyOracle: Automatic analysis of musical structure](http://nbviewer.ipython.org/urls/bitbucket.org/pucktronix/pyoracle/raw/d046b2bcf473503fa356094cfe4cff774d1aaefc/270D.ipynb), by [Greg Surges](http://gregsurges.com).
+
+* [A Gallery of SciPy's Window Functions for quick visual inspection and comparison](http://nbviewer.ipython.org/urls/gist.githubusercontent.com/jaidevd/b7d865f7f4b237ab5181/raw/30bc8f998bf8f924b56b32ce10acce125656ed7c/scipy_window_gallery.ipynb),
+ by [Jaidev Deshpande](http://twitter.com/jaidevd).
+
+### Natural Language Processing
+
+* [Python Programming for the Humanities](http://fbkarsdorp.github.io/python-course/) by Folgert Karsdorp & Maarten van Gompel.
+
+
+
+
+### Pandas for data analysis
+
+Note that the 'collections' section above also contains pandas-related links, such as the one for an [11-lesson tutorial](https://bitbucket.org/hrojas/learn-pandas).
+
+* [A 10-minute whirlwind tour of pandas](http://nbviewer.ipython.org/gist/wesm/4757075/PandasTour.ipynb), the notebook accompanying a [video presentation](http://vimeo.com/59324550) by Wes McKinney, author of Pandas and of the [Python for Data Analysis](http://www.amazon.com/Python-Data-Analysis-Wes-McKinney/dp/1449319793) book.
+
+* [Time-series analysis with Pandas](http://nbviewer.ipython.org/github/changhiskhan/talks/blob/master/pydata2012/pandas_timeseries.ipynb).
+
+* [Financial data analysis with Pandas](http://nbviewer.ipython.org/gist/3962843).
+
+* [Clustering of smartphone sensor data for human activity detection using pandas and scipy](http://nbviewer.ipython.org/github/herrfz/dataanalysis/blob/master/week4/clustering_example.ipynb), part of a Coursera data analysis course, done in Python ([repo](https://github.com/herrfz/dataanalysis)).
+ +* [Log analysis with Pandas](http://nbviewer.ipython.org/url/taaviburns.ca/presentations/log_analysis_with_pandas/nb/5-Scatterplots.ipynb), part of a [group presented at PyConCa 2012](http://taaviburns.ca/presentations/log_analysis_with_pandas/) by Taavi Burns. + +* [Analyzing and visualizing sun spot data with Pandas](http://nbviewer.ipython.org/gist/4569783), by [Josh Hemann](https://github.com/jhemann). An enlightening discussion of how naive plotting choices subtly influence our interpretation of data. + +* [Advanced analysis of Apache logs](http://nbviewer.ipython.org/github/koldunovn/nk_public_notebooks/blob/master/Apache_log.ipynb), by [Nikolay Koldunov](https://github.com/koldunovn). + +* [Statistical Data Analysis in Python] (https://github.com/fonnesbeck/statistical-analysis-python-tutorial), by [Christopher Fonnesbeck](https://github.com/fonnesbeck/), SciPy 2013. Companion videos [1](https://www.youtube.com/watch?v=DXPwSiRTxYY), [2](https://www.youtube.com/watch?v=TGEBpzJUxdI), [3](https://www.youtube.com/watch?v=YZDtBEEZuAk), [4](https://www.youtube.com/watch?v=5_rcdhBXD-0) + +## General Python Programming + +* [Learning to code with Python](http://nbviewer.ipython.org/urls/bitbucket.org/amjoconn/watpy-learning-to-code-with-python/raw/3441274a54c7ff6ff3e37285aafcbbd8cb4774f0/notebook/Learn%20to%20Code%20with%20Python.ipynb), part of an [introduction to Python](https://bitbucket.org/amjoconn/watpy-learning-to-code-with-python/src) from the [Waterloo Python users group](http://watpy.ca/blog/post/learn-code-python-review-feb-2013). + +* [Python Descriptors Demystified](http://nbviewer.ipython.org/gist/ChrisBeaumont/5758381/descriptor_writeup.ipynb), an in-depth discussion of the descriptor protocol in Python, by [Chris Beaumont](http://chrisbeaumont.org). 
+ +* [A collection of not so obvious Python stuff you should know!](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/not_so_obvious_python_stuff.ipynb?create=1), by [Sebastian Raschka](https://github.com/rasbt). + +* [Key differences between Python 2.7.x and Python 3.x](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/key_differences_between_python_2_and_3.ipynb), by [Sebastian Raschka](https://github.com/rasbt). + +* [A beginner's guide to Python's namespaces, scope resolution, and the LEGB rule](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb?create=1), by [Sebastian Raschka](https://github.com/rasbt). + +* [Sorting CSV files using the Python csv module](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/sorting_csvs.ipynb), by [Sebastian Raschka](https://github.com/rasbt). + +* Python 3 OOP series by [Leonardo Giordani](https://github.com/lgiordani): [Part 1: Objects and types](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_1__Objects_and_types.ipynb), [Part 2: Classes and members](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_2__Classes_and_members.ipynb), [Part 3: Delegation - composition and inheritance](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_3__Delegation__composition_and_inheritance.ipynb), [Part 4: Polymorphism](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_4__Polymorphism.ipynb), [Part 5: Metaclasses](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_5__Metaclasses.ipynb), [Part 6: Abstract Base 
Classes](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_6__Abstract_Base_Classes.ipynb) + +## Notebooks in languages other than Python + +These are notebooks that use [one of the IPython kernels for other languages](IPython kernels for other languages): + +### Julia + +The IPython protocols to communicate between kernels and clients are language agnostic, and other programming language communities have started to build support for this protocol in their language. The Julia team has created [IJulia](https://github.com/JuliaLang/IJulia.jl), and these are some Julia notebooks: + +* [Fractals 3 ways](http://nbviewer.ipython.org/url/beowulf.csail.mit.edu/18.337/fractals.ipynb), by [Jeff Bezanson](https://github.com/JeffBezanson). + +* [The Design Impact of Multiple Dispatch](http://nbviewer.ipython.org/gist/StefanKarpinski/b8fe9dbb36c1427b9f22), a detailed explanation of Julia's multiple dispatch design, by [Stefan Karpinski](https://github.com/StefanKarpinski). + +* A [tutorial](http://nbviewer.ipython.org/gist/7551139) on making interactive graphs with [Plotly](https://plot.ly/) and Julia. + +* [Numerical tours in Julia](http://www.numerical-tours.com/julia/) + +* [Functional Geometry](http://nbviewer.ipython.org/github/shashi/ijulia-notebooks/blob/master/funcgeo/Functional%20Geometry.ipynb) by [Shashi Gowda](https://github.com/shashi) + +* [JuliaOpt notebooks](http://nbviewer.ipython.org/github/JuliaOpt/juliaopt-notebooks/tree/master/notebooks/), a collection of optimization-related notebooks. + +* Coursework using IJulia notebooks: + - [Métodos Numéricos Avanzados (2015-2)](https://github.com/dpsanders/MetodosNumericosAvanzados), Luis Benet and David P. Sanders + - [Métodos Monte Carlo](https://github.com/dpsanders/metodos-monte-carlo), David Sanders + - [Linear Partial Differential Equations: Analysis and Numerics](http://math.mit.edu/~stevenj/18.303/), Steven G. 
Johnson + - [Julia tutorial for Computational Molecular Biology](http://cs.brown.edu/courses/csci1810/Julia_Tutorial.html), Younhun Kim and Matthew Reyna + +* Other collections of IJulia notebooks: + - [Jiahao Chen](http://jiahao.github.io/code/) + - [Christoph Ortner](https://homepages.warwick.ac.uk/staff/C.Ortner/index.php?page=julia) + - [Crossing Language Barriers with Julia, Scipy, and IPython](https://github.com/stevengj/Julia-EuroSciPy14), presented at EuroSciPy '14 by Steven G. Johnson. + +### Haskell +There exists a Haskell kernel for IPython in the [IHaskell project](https://github.com/gibiansky/ihaskell). + +* [IHaskell Demo Notebook](http://nbviewer.ipython.org/github/gibiansky/IHaskell/blob/master/notebooks/IHaskell.ipynb) +* [Homophone reduction](http://nbviewer.ipython.org/github/gibiansky/IHaskell/blob/master/notebooks/Homophones.ipynb), a solution to a cute problem involving treating English letters as generators of a large group. +* [Gradient descent typeclass](http://nbviewer.ipython.org/github/gibiansky/IHaskell/blob/master/notebooks/Gradient-Descent.ipynb), a look at how arbitrary gradient descent algorithms can be represented with a typeclass. + +### OCaml + +[iocaml](https://github.com/andrewray/iocaml) is an OCaml kernel for IPython + +* [H.261 Video Decoding in OCaml](https://andrewray.github.io/iocamljs/oh261.html) +* [OCaml implementation of the 2048 game](http://gazagnaire.org/fuconf14/) + +### Ruby + +Similar to the Julia kernel there exists also a [Ruby kernel](https://github.com/SciRuby/iruby) for IPython. 
+ +* [IRuby Demo Notebook](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/getting_started.ipynb) +* [SciRuby Notebooks](https://github.com/SciRuby/sciruby-notebooks) + +The interactive plotting library [Nyaplot](https://github.com/domitry/nyaplot) has some case studies using IRuby: + +* [War expenditure per GDP](http://nbviewer.ipython.org/github/domitry/nyaplot/blob/master/examples/notebook/Mapnya.ipynb#Case2-:Fill-countries-in-different-colors) +* [Finding shape consensus among multiple geometrical polygons](http://nbviewer.ipython.org/gist/mgiraldo/a68b53175ce5892531bc) + +### Perl + +* An example showcasing full use of the [display protocol](http://nbviewer.ipython.org/github/zmughal/zmughal-iperl-notebooks/blob/master/IPerl-demos/20150209_IPerl_display_demo.ipynb) with the IPerl kernel. + +## Miscellaneous topics about doing various things *with* the Notebook itself + +* [Blogging With IPython in Blogger](http://nbviewer.ipython.org/github/fperez/blog/blob/master/120907-Blogging%20with%20the%20IPython%20Notebook.ipynb), also available in [blog post form](http://blog.fperez.org/2012/09/blogging-with-ipython-notebook.html), [full repo here](https://github.com/fperez/blog). By Fernando Perez. + +* [Blogging With IPython in Octopress](http://nbviewer.ipython.org/urls/github.com/jakevdp/jakevdp.github.com/raw/master/downloads/notebooks/nb_in_octopress.ipynb), by Jake van der Plas and available as a [blog post](http://jakevdp.github.com/blog/2012/10/04/blogging-with-ipython/). Other [notebooks by Jake](https://github.com/jakevdp/jakevdp.github.com/tree/master/downloads/notebooks) contain many more great examples of doing interesting work with the scientific Python stack. + +* [Blogging With IPython in Nikola](http://nbviewer.ipython.org/url/www.damian.oquanta.info/posts/blogging-with-nikola-and-ipython.ipynb), also available in [blog post form](http://www.damian.oquanta.info/posts/blogging-with-nikola-and-ipython.html) by Damián Avila. 
+ +* [Custom CSS control of the notebook](http://nbviewer.ipython.org/github/Carreau/posts/blob/master/Blog1.ipynb), this is part of a [blog repo](https://github.com/Carreau/posts) by Matthias Bussonnier. + +* [IPython display hookery: tools to help display visual output from various sources](http://nbviewer.ipython.org/gist/5162445), a gist by [@deeplook](https://github.com/deeplook). + +* [Importing IPython Notebooks as Modules](http://nbviewer.ipython.org/gist/6011986) by [Min RK](http://github.com/minrk). + +## Reproducible academic publications + +This section contains academic papers that have been published in the peer-reviewed literature or pre-print sites such as the [ArXiv](http://arxiv.org) that include one or more notebooks that enable (even if only partially) readers to reproduce the results of the publication. If you include a publication here, please link to the journal article as well as providing the nbviewer notebook link (and any other relevant resources associated with the paper). + +1. [Reply to 'Influence of cosmic ray variability on the monsoon rainfall and temperature': a false-positive in the field of solar-terrestrial research](http://arxiv.org/abs/1502.00505) by [Benjamin Laken](http://www.benlaken.com), 2015. Reviewed article will appear in JASTP. The [IPython notebook](http://nbviewer.ipython.org/github/benlaken/Comment_BadruddinAslam2014/blob/master/Monsoon_analysis.ipynb) reproduces the full analysis and figures exactly as they appear in the article, and is available on Github: link via [figshare](http://figshare.com/articles/Comment_on_Badruddin_amp_Aslam_2014_/1299413). + +1. [The probability of improvement in Fisher's geometric model: a probabilistic approach](http://dx.doi.org/10.1016/j.tpb.2014.10.004), by [Yoav Ram](http://www.yoavram.com/) and [Lilach Hadany](https://sites.google.com/site/hadanylab/). (Theoretical Population Biology, 2014). 
An [IPython notebook](http://nbviewer.ipython.org/url/www.sciencedirect.com/science/MiamiMultiMediaURL/1-s2.0-S0040580914000811/1-s2.0-S0040580914000811-mmc1.txt/272364/FULL/S0040580914000811/471cf02085a52c248dc76ae65ad4409d/mmc1.txt), allowing figure reproduction, was deposited as a [supplementary file](http://www.sciencedirect.com/science/MiamiMultiMediaURL/1-s2.0-S0040580914000811/1-s2.0-S0040580914000811-mmc1.txt/272364/FULL/S0040580914000811/471cf02085a52c248dc76ae65ad4409d/mmc1.txt).
+
+1. [Stress-induced mutagenesis and complex adaptation](http://rspb.royalsocietypublishing.org/content/281/1792/20141025.abstract), by [Yoav Ram](http://www.yoavram.com/) and [Lilach Hadany](https://sites.google.com/site/hadanylab/) (Proceedings B, 2014). An [IPython notebook](https://github.com/yoavram/ruggedsim/blob/master/manuscript/supplementry.ipynb), allowing figure reproduction, was deposited as a [supplementary file](http://rspb.royalsocietypublishing.org/content/suppl/2014/08/19/rspb.2014.1025.DC1).
+
+1. [Automatic segmentation of odor maps in the mouse olfactory bulb using regularized non-negative matrix factorization](http://www.sciencedirect.com/science/article/pii/S1053811914003103), by J. Soelter et al. (Neuroimage 2014, Open Access). The [notebook](http://nbviewer.ipython.org/github/jansoe/FUImaging/blob/master/examples/IOSsegmentation/regNMF.ipynb) lets readers reproduce most figures from the paper and provides a deeper look at the data. The [full code repository](https://github.com/jansoe/FUImaging/tree/Neuroimage2014) is also available.
+
+1. [Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss, by A. Gross et al. (Nature Genetics 2014)](http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.3051.html). The [full collection of notebooks to replicate the results](https://github.com/theandygross/TCGA/tree/master/Analysis_Notebooks#guide-to-running) is available.
+
+1. [powerlaw: a Python package for analysis of heavy-tailed distributions, by J.
Alstott et al.](https://code.google.com/p/powerlaw/). [Notebook of examples in manuscript](http://nbviewer.ipython.org/gist/19fcdd6a4ba400ce8de2), [ArXiv link](http://arxiv.org/abs/1305.0215) and [project repository](https://github.com/jeffalstott/powerlaw). + +1. [Collaborative cloud-enabled tools allow rapid, reproducible biological insights, by B. Ragan-Kelley et al.](http://www.nature.com/ismej/journal/v7/n3/full/ismej2012123a.html). The [main notebook](http://nbviewer.ipython.org/gist/3693491/cloud_demo_complete.ipynb), the [full collection of related notebooks](http://nbviewer.ipython.org/gist/3693491) and the [companion site](http://qiime.org/home_static/nih-cloud-apr2012) with the Amazon AMI information for reproducing the full paper. + +1. [A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data, by C.T. Brown et al.](http://ged.msu.edu/papers/2012-diginorm). [Full notebook](http://nbviewer.ipython.org/urls/github.com/ged-lab/2012-paper-diginorm/raw/master/notebook/diginorm.ipynb), [ArXiv link](http://arxiv.org/abs/1203.4802) and [project repository](https://github.com/ged-lab/2012-paper-diginorm). + +1. [The kinematics of the Local Group in a cosmological context](http://arxiv.org/abs/1303.2690) by [J.E. Forero-Romero et al.](http://wwwprof.uniandes.edu.co/~je.forero/). The [Full notebook](http://nbviewer.ipython.org/github/forero/LG_Kinematics/blob/master/code/main_analysis.ipynb) and also all the data in a [github repo](https://github.com/forero/LG_Kinematics). + +1. [Warming Ocean Threatens Sea Life](http://www.scientificamerican.com/article.cfm?id=warming-ocean-threatens-sea-life), an article in Scientific American [backed by a notebook for its main plot](http://nbviewer.ipython.org/github/robertodealmeida/notebooks/blob/master/scientific_american/Scientific%20American%20graph.ipynb). By [Roberto de Almeida](https://github.com/robertodealmeida) from [MarinExplore](https://marinexplore.com). + +1. 
[Extrapolating Weak Selection in Evolutionary Games](http://nbviewer.ipython.org/github/juliangarcia/ews/blob/master/notebook.ipynb), by Wu, García, Hauert and Traulsen. [PLOS Comp Bio paper](http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003381) and [Figshare link](http://figshare.com/articles/Extrapolating_weak_selection_in_evolutionary_games_source_code/814470).
+
+1. [Using neural networks to estimate redshift distributions. An application to CFHTLenS](http://nbviewer.ipython.org/urls/bitbucket.org/christopher_bonnett/nn_notebook/raw/5e69b55193a229cb2076a2f18e43b45c56e317e0/T-800.ipynb),
+by Christopher Bonnett.
+[Paper](http://arxiv.org/abs/1312.1287) (submitted to MNRAS).
+
+1. [Mechanisms for stable, robust, and adaptive development of orientation maps in the primary visual cortex](http://dx.doi.org/10.1523/JNEUROSCI.1037-13.2013) by Jean-Luc R. Stevens, Judith S. Law, Jan Antolik, and James A. Bednar. Journal of Neuroscience, 33:15747-15766, 2013. [Notebook1](https://ioam.github.io/topographica/_static/gcal_notebook.html),
+[Notebook2](https://ioam.github.io/topographica/_static/stevens_jn13_notebook.html).
+
+1. [Accelerated Randomized Benchmarking](http://nbviewer.ipython.org/github/cgranade/accelerated-randomized-benchmarking/blob/master/src/model_testing.ipynb), by [Christopher Granade](http://www.cgranade.com/), [Christopher Ferrie](https://sites.google.com/site/csferrie/) and D. G. Cory. [New Journal of Physics **17** 013042 (2015)](http://iopscience.iop.org/article/10.1088/1367-2630/17/1/013042/meta;jsessionid=1F48CDD7C7D7849B0777C495ED1551CC.c1), [arXiv](http://arxiv.org/abs/1404.5275), [GitHub repo](https://github.com/cgranade/accelerated-randomized-benchmarking).
+
+1. [Dynamics and associations of microbial community types across the human body](http://dx.doi.org/10.1038/nature13178), by Tao Ding & Patrick D. Schloss. [Notebook replicating results](http://nbviewer.ipython.org/gist/pschloss/9815766/notebook.ipynb).
+
+1.
[Variations in submarine channel sinuosity as a function of latitude and slope](http://nbviewer.ipython.org/gist/zsylvester/6040d0015b9b907bc788), by Sylvester, Z., Pirmez, C., Cantelli, A., & Jobe, Z. R. + +1. [Frontoparietal representations of task context support the flexible control of goal directed cognition](http://www.jneurosci.org/content/34/32/10743.short), by M.L. Waskom, D. Kumaran, A.M. Gordon, J. Rissman, & A.D. Wagner. [Github repository](https://github.com/WagnerLabPapers/Waskom_JNeurosci_2014) | [Main notebook](http://nbviewer.ipython.org/github/WagnerLabPapers/Waskom_JNeurosci_2014/blob/master/Behavioral_and_Decoding_Analyses.ipynb) + +1. [pyparty: Intuitive Particle Processing in Python](http://openresearchsoftware.metajnl.com/article/view/jors.bh), Adam Hughes [Notebook to Generate the Published Figures](http://nbviewer.ipython.org/github/hugadams/pyparty/blob/master/examples/Notebooks/JORS_data.ipynb?create=1) | Also, check out the [pyparty tutorial notebooks](https://github.com/hugadams/pyparty). + +1. [Indication of family-specific DNA methylation patterns in developing oysters](http://biorxiv.org/content/early/2014/12/16/012831), Claire E. Olson, Steven B. Roberts +doi: http://dx.doi.org/10.1101/012831. [Notebook to generate results in the paper](http://nbviewer.ipython.org/github/che625/olson-ms-nb/blob/master/BiGo_dev.ipynb). + +1. [Parallel Prefix Polymorphism Permits Parallelization, Presentation & Proof](http://conferences.computer.org/hptcdl/2014/papers/7020a047.pdf), [Jiahao Chen](http://jiahao.github.io) and [Alan Edelman](http://www-math.mit.edu/~edelman/), HPTCDL'14. [Website](http://jiahao.github.io/parallel-prefix) and [notebook](https://github.com/jiahao/ijulia-notebooks/blob/master/2014-08-06-parallel-prefix.ipynb) + +1. 
[Transcriptome Sequencing Reveals Potential Mechanism of Cryptic 3’ Splice Site Selection in *SF3B1*-mutated Cancers](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004105) by [Christopher DeBoever](http://cdeboever3.github.io/) et al. There are several [notebooks](https://github.com/cdeboever3/deboever-sf3b1-2015/tree/master/notebooks) to replicate results and make figures. + +1. [A Workflow for Characterizing Nanoparticle Monolayers for Biosensors: Machine Learning on Real and Artificial SEM Images](https://peerj.com/preprints/671/), Adam Hughes, Zhaowen Liu, Maryam Raftari, Mark. E Reeves. Notebooks are linked in Table 1 in the text. + +1. [AtomPy: An Open Atomic Data Curation Environment for Astrophysical Applications](http://www.mdpi.com/2218-2004/2/2/123), by C. Mendoza, J. Boswell, D. Ajoku, M. Bautista. + +1. [Visualizing 4-Dimensional Asteroids](http://blogs.scientificamerican.com/sa-visual/2014/09/16/visualizing-4-dimensional-asteroids/), in Scientific American (by Jake VanderPlas) + +1. [Challenges and opportunities in understanding microbial communities with metagenome assembly](http://journal.frontiersin.org/article/10.3389/fmicb.2015.00678/abstract), [accompanied by IPython Notebook tutorial](http://nbviewer.ipython.org/github/germs-lab/frontiers-review-2015/blob/master/frontiers-nb-2015.ipynb), by [Adina Howe](http://germslab.org) and Patrick Chain. + +## Data-driven journalism + +* [The Need for Openness in Data Journalism](http://nbviewer.ipython.org/github/brianckeegan/Bechdel/blob/master/Bechdel_test.ipynb), by [Brian Keegan](http://www.brianckeegan.com). + +* [St. Louis County Segregation Analysis](https://github.com/BuzzFeedNews/2014-08-st-louis-county-segregation) , analysis for the article [The Ferguson Area Is Even More Segregated Than You Probably Guessed](http://www.buzzfeed.com/jsvine/the-ferguson-area-is-even-more-segregated-than-you-thought) by [Jeremy Singer-Vine](https://twitter.com/jsvine). 
+
+## Whimsical notebooks
+
+* [XKCD-styled plots created with Matplotlib](http://nbviewer.ipython.org/url/jakevdp.github.com/downloads/notebooks/XKCD_plots.ipynb). Here is the [blog post version](http://jakevdp.github.com/blog/2012/10/07/xkcd-style-plots-in-matplotlib/) with discussion. By Jake van der Plas.
+
+* [Van Gogh's Starry Night with ipythonblocks](http://nbviewer.ipython.org/github/jiffyclub/ipythonblocks/blob/master/demos/starry_night_to_text.ipynb), part of Matt Davis' [ipythonblocks](https://github.com/jiffyclub/ipythonblocks). This is a teaching tool for use with the IPython notebook that provides visual elements to help understand programming concepts.
+
+* [Conway's Game of Life](http://nbviewer.ipython.org/gist/3778422). An interesting use of a convolution operation to calculate the next state of the game board, instead of the obvious approach of finding each cell's neighbors and filtering the board.
+
+* [pynguins](https://gist.github.com/denfromufa/9a5e1fdeaf611dc60ea8). Uses Jupyter notebook, Python, and NumPy to solve the board game "Penguins on Ice".
+
+* ["People plots"](http://nbviewer.ipython.org/gist/4544012), stick figures generated with matplotlib.
+
+* [Reveal converter mini-tutorial](http://nbviewer.ipython.org/url/www.damian.oquanta.info/posts/reveal-converter-mini-tutorial.ipynb), also available in [blog post form](http://www.damian.oquanta.info/posts/reveal-converter-mini-tutorial.html). Do you want to make a static HTML/CSS slideshow straight from the IPython notebook? Now you can, with the *reveal* converter (nbconvert). [Demo](http://www.slideviper.oquanta.info/tutorial/slideshow_tutorial_slides.html) by Damián Avila.
+
+* [Personal IPython Weight Notebook](http://nbviewer.ipython.org/gist/9769238). Plot your weight loss, with prognosis and motivation features.
+
+* [Streaming Double Pendulum Simulation in IPython NB](http://nbviewer.ipython.org/github/plotly/python-user-guide/blob/master/s7_streaming/s7_streaming.ipynb).
+
+* [Porque Charles Xavier debe cambiar a Cerebro por Python](http://nbviewer.ipython.org/github/mshopper/aurora/blob/master/Aurora.ipynb), a study in data and gender in the Marvel comics universe, by [Mai Giménez](http://twitter.com/adahopper) and [Angela Rivera](http://twitter.com/ghilbrae).
+
+* [Functional Geometry: a deconstruction of the MC Escher woodcut Square Limit](http://nbviewer.ipython.org/github/shashi/ijulia-notebooks/blob/master/funcgeo/Functional%20Geometry.ipynb), an IJulia notebook by [Shashi Gowda](http://shashi.github.io/).
+
+## Videos of IPython being used in the wild
+
+Of course, the first thing you might try is searching for videos about IPython (1900 or so by last count on YouTube), but there are also demonstrations of other applications that use the power of IPython without mentioning it in their descriptions. Below are a few such videos:
+
+* [Video](http://www.youtube.com/watch?v=Nc16qeGBtMU) on how to learn Python, featuring IPython as the platform of choice for learning!
+
+* [This video](http://www.youtube.com/watch?v=4ONBVNm3isI) shows IPython being used in the [scikit-learn](http://scikit-learn.org) project.
+
+* The speaker doesn't show IPython in use, but his IPython sticker is visible for the entire video: [Planning and Tending the Garden: The Future of Early Childhood Python Education](http://www.youtube.com/watch?v=op61s-QHryk).
+
+* [Wes McKinney's speech](http://youtu.be/qbYYamU42Sw?t=5m9s) on Python and data analysis features IPython, as does his book [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do).
+
+* [This video](https://www.youtube.com/watch?v=zG8FYPFU9n4) shows Plotly and IPython in use at a Montreal Python meetup.
diff --git a/userguide/en/notebook/nb.md b/userguide/en/notebook/nb.md
new file mode 100644
index 0000000..c54b68c
--- /dev/null
+++ b/userguide/en/notebook/nb.md
@@ -0,0 +1,111 @@
+## Notebook Usage ##
+
+Open the Dashboard and create a Python3 Notebook by selecting ```New - Python3```.
+You will then enter the Python3 Notebook interactive programming environment.
+At the same time, an IPython Notebook file named `untitled.ipynb` is
+generated for this session.
+
+This section introduces a basic [Hello World](#hello-world) example,
+an example of [visualizing a machine learning result](#visualizing-machine-learning-result),
+and a [complete tutorial](#complete-tutorial).
+
+### Hello World ###
+
+In a new Cell, enter the following Python3 code:
+
+```python
+print("Hello World")
+```
+
+Run it to get the result:
+
+```
+Hello World
+```
+
+### Visualizing Machine Learning Result ###
+
+Visualization requires `matplotlib`, which must be loaded first.
+
+Enter and execute the following code in a new Cell:
+
+```python
+%matplotlib inline
+```
+
+or
+
+```python
+%matplotlib notebook
+```
+
+If no error is reported, the notebook is ready to draw graphs in the browser.
+
+Enter and execute the sample machine learning code in a new Cell:
+
+```python
+# import packages
+import numpy as np
+import matplotlib.pyplot as plt
+from sklearn import linear_model, datasets
+
+# load data: keep only classes 0 and 1 (binary classification) and features 0 and 2 (sepal length and petal length)
+iris = datasets.load_iris()
+X = iris.data[iris.target!=2][:, [0,2]]
+Y = iris.target[iris.target!=2]
+
+h = .02 # step size in the mesh
+
+logreg = linear_model.LogisticRegression(C=1e5)
+logreg.fit(X, Y)
+
+# Plot the decision boundary. For that, we will assign a color to each
+# point in the mesh [x_min, x_max]x[y_min, y_max].
+x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
+y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
+xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
+Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()])
+
+# Put the result into a color plot
+Z = Z.reshape(xx.shape)
+#plt.figure(1, figsize=(4, 3))
+plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)
+plt.xlabel('Sepal length')
+plt.ylabel('Petal length')
+
+# Plot also the training points
+plt.scatter(X[:, 0], X[:, 1], c=Y, edgecolors='k', cmap=plt.cm.Paired)
+plt.xlabel('Sepal length')
+plt.ylabel('Petal length')
+
+plt.xlim(xx.min(), xx.max())
+plt.ylim(yy.min(), yy.max())
+plt.xticks(())
+plt.yticks(())
+
+plt.savefig("learn.svg")
+```
+
+After a moment, you will see output like this:
+
+
+
+At the same time, a vector graphic ```learn.svg``` is generated in the
+current directory, which can be opened in the Dashboard.
+
+You can also display it directly in the notebook by entering and executing:
+
+```python
+from IPython.display import SVG
+SVG("learn.svg")
+```
+
+### Complete Tutorial ###
+
+Many tasks can be done in IPython Notebook, some of them quite interesting: for example, displaying JPG/PNG/SVG pictures, videos, HTML files, and PDF files; loading external resources such as a remote web page or even a YouTube video; and rendering LaTeX math formulas.
+
+The complete user guide and tutorials are available [here](http://nbviewer.jupyter.org/github/ipython/ipython/blob/4.0.x/examples/IPython%20Kernel/Index.ipynb).
+
+[Jupyter nbviewer](http://nbviewer.jupyter.org) also shows many examples.
+
+
diff --git a/userguide/en/notebook/python.md b/userguide/en/notebook/python.md
new file mode 100644
index 0000000..a9f57fa
--- /dev/null
+++ b/userguide/en/notebook/python.md
@@ -0,0 +1,821 @@
+## Python Packages ##
+
+Python3 packages in the Docklet base image
+
+| Python Software | License | Summary | Note |
+| --- | --- | --- | --- |
+| basemap | PSF | Plot data on map projections with Matplotlib | Map |
+| bcolz | BSD | Columnar and compressed data containers. | Data |
+| bcrypt | Apache | Modern password hashing for your software and your servers | Crypt |
+| BeautifulSoup4 | MIT | Sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree. | Web |
+| biopython | (as-is) BSD-like | Tools for biological computation | Biological |
+| blaze | BSD | Data migration utilities | Data |
+| bokeh | BSD | Statistical and novel interactive HTML plots for Python | Plot |
+| cartopy | LGPL | A Python package designed to make drawing maps for data analysis and visualisation as easy as possible | Map |
+| chaco | BSD | Library for generating interactive 2D plots | Plot |
+| cheetah | MIT | Template engine and code generation tool | Web |
+| coverage | BSD | Code coverage measurement for Python | Dev |
+| cryptography | Apache 2.0 | cryptography is a package which provides cryptographic recipes and primitives to Python developers. | Crypt |
+| Cython | Apache | Python-based language for writing C extensions for Python | Dev |
+| docutils | public-domain with exceptions (BSD, PSF) | Documentation utilities | Doc |
+| flask | BSD | A lightweight web application framework based on Werkzeug, Jinja2 and good intentions. | Web |
+| Flask-Babel | BSD | Adds i18n/l10n support to Flask applications | Web |
+| gevent | MIT | Python network library using greenlet and libevent for easy and scalable concurrency | Sys |
+| gevent_websocket | Apache 2.0 | Websocket handler for the gevent pywsgi server, a Python network library | Network |
+| ggplot | BSD | ggplot for python | |
+| h5py | BSD | Python interface to the HDF library | Data |
+| Jinja2 | BSD | Template engine | Web |
+| jupyter | BSD | Jupyter metapackage. Install all the Jupyter components in one go. | Web |
+| lmfit | BSD | Least-Squares Minimization with Bounds and Constraints | Scipy |
+| lxml | BSD | XML/XSLT library with bindings to libxml2/libxslt | Doc |
+| M2Crypto | MIT | M2Crypto: A Python crypto and SSL toolkit | Crypt |
+| matplotlib | PSF | Interactive 2D plotting library | Plot |
+| MDP | BSD | Modular toolkit for Data Processing (MDP) | Data |
+| memory_profiler | BSD | A module for monitoring memory usage of a python program | Dev |
+| msgpack | Apache 2.0 | Efficient binary serialization library | Dev |
+| multipledispatch | BSD | Multiple dispatch in Python. | Dev |
+| networkx | BSD | Create, manipulate, and analyze graphs and networks | Graph |
+| nltk | Apache 2.0 | Suite of tools for symbolic and statistical natural language processing | NLP |
+| numexpr | MIT | Fast evaluation of array expressions | Math |
+| numpy | BSD | General-purpose multi-dimensional array-processing and math library | Math |
+| openpyxl | MIT/Expat | Read and write Excel OpenXML files | Excel |
+| pandas | BSD | Data manipulation and analysis library | Data |
+| paste | MIT | Tools for using a Web Server Gateway Interface stack | Web |
+| patsy | BSD | Package for describing statistical models and building design matrices | Data |
+| pep8 | BSD | Python style guide checker | Dev |
+| pexpect | ISC license | Pexpect allows easy control of interactive console applications. | Sys |
+| PIL | MIT | Image processing library | Image |
+| plotly | MIT | Python plotting library for collaborative, interactive, publication-quality graphs. | Plot |
+| ply | BSD | Python implementation of lex and yacc | Dev |
+| prettyplotlib | MIT | Painlessly create beautiful default matplotlib plots. | Plot |
+| psutil | BSD | OS interface to processes and tasks | Sys |
+| py | MIT | Library with cross-python path, ini-parsing, io, code, log facilities | Sys |
+| pyephem | LGPL | Scientific-grade astronomy routines | Astronomy |
+| pyflakes | MIT | Static analysis of Python code to detect common errors | Dev |
+| Pygments | BSD | Code syntax highlighting package written in Python | Dev |
+| pyOpenSSL | Apache License 2 | Python bindings to the OpenSSL library | Network |
+| pysal | BSD | A library of spatial analysis functions. | Scipy |
+| PyTables | BSD | Hierarchical datasets for extremely large data | Data |
+| pytest | MIT | Simple powerful testing with Python | Dev |
+| PyYAML | MIT | YAML parser and emitter | Doc |
+| pyzmq | BSD-like | Binding to the ZeroMQ fast messaging library | Middleware |
+| requests | Apache 2.0 | Elegant and simple HTTP library for Python | Web |
+| scikit_learn | BSD | Machine learning and data mining routines | ML |
+| SciPy | BSD | Libraries for mathematics, science, and engineering | Scipy |
+| SCons | MIT | Pythonic substitute for Make | Dev |
+| seaborn | BSD | Seaborn: statistical data visualization | Data |
+| setuptools | PSF | Easily download, build, install, upgrade, and uninstall Python packages | Sys |
+| Shapely | BSD | Geometric objects, predicates, and operations | Geo |
+| simplegeneric | ZPL 2.1 | Simple generic functions (similar to Python's own len(), pickle.dump(), etc.) | Dev |
+| Sphinx | BSD | Creates intelligent and beautiful project documentation | Doc |
+| SQLAlchemy | MIT | SQL toolkit and Object Relational Mapper | Data |
+| statsmodels | BSD | Statistical computations and models for use with SciPy | Scipy |
+| SymPy | BSD | Symbolic mathematics library | Math |
+| Tornado | Apache 2.0 | Scalable, non-blocking web server | Web |
+| virtualenv | MIT | Virtual Python Environment builder | Sys |
+| werkzeug | BSD | Advanced WSGI utility modules | Web |
+| xarray | Apache | N-D labeled arrays and datasets in Python | Math |
+| xlrd | BSD | Extract data from Microsoft Excel (tm) spreadsheet files | Excel |
+| xlsxwriter | BSD | A Python module for creating Excel XLSX files. | Excel |
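
To confirm that packages from the table are importable in a given notebook, a quick check like the following may help. This is a minimal sketch, not part of Docklet itself: the `IMPORT_NAMES` mapping and the `check_packages` helper are hypothetical names introduced here for illustration, and the mapping covers only a small subset of the table. It is needed because the import name can differ from the package name (for example, BeautifulSoup4 is imported as `bs4`).

```python
import importlib.util

# Hypothetical, illustrative subset: package name in the table -> import name.
# Import names can differ from package names (e.g. BeautifulSoup4 -> bs4).
IMPORT_NAMES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit_learn": "sklearn",
    "BeautifulSoup4": "bs4",
    "PyYAML": "yaml",
}


def check_packages(import_names):
    """Return {package: bool}, True when the top-level module can be located."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in import_names.items()
    }


# Print one line per package without actually importing anything.
for package, available in sorted(check_packages(IMPORT_NAMES).items()):
    print("{:<16} {}".format(package, "available" if available else "missing"))
```

`importlib.util.find_spec` only locates the module and does not import it, so the check stays fast even for heavy packages. A package reported as missing here may simply be absent from the image version in use.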
+ + diff --git a/userguide/en/notebook/r.md b/userguide/en/notebook/r.md new file mode 100644 index 0000000..8083809 --- /dev/null +++ b/userguide/en/notebook/r.md @@ -0,0 +1,1845 @@ +## R Packages ## + +### Recommended R Packages ### + +The article [Quick list of useful R packages](https://support.rstudio.com/hc/en-us/articles/201057987-Quick-list-of-useful-R-packages) gives a list of recommended R packages. + +

Many useful R functions come in packages, free libraries of code written by R's active user community. To install an R package, open an R session and type at the command line:

+
install.packages("<the package's name>")
+
+

R will download the package from CRAN, so you'll need to be connected to the internet. Once you have a package installed, you can make its contents available to use in your current R session by running

+
library("<the package's name>")
+
+

There are thousands of helpful R packages for you to use, but navigating them all can be a challenge. To help you out, we've compiled this guide to some of the best. We've used each of these, and found them to be outstanding – we've even written some of them. But you don't have to take our word for it: these packages are also among the most downloaded R packages.

+

To load data

+

RODBC, RMySQL, RPostgreSQL, RSQLite - If you'd like to read in data from a database, these packages are a good place to start. Choose the package that fits your type of database.

+

XLConnect, xlsx - These packages help you read and write Microsoft Excel files from R. You can also just export your spreadsheets from Excel as .csv files.

+

foreign - Want to read a SAS data set into R? Or an SPSS data set? Foreign provides functions that help you load data files from other programs into R.

+

R can handle plain text files – no package required. Just use the functions read.csv, read.table, and read.fwf. If you have even more exotic data, consult the CRAN guide to data import and export.

+ +

To manipulate data

+ +

dplyr - Essential shortcuts for subsetting, summarizing, rearranging, and joining together data sets. dplyr is our go-to package for fast data manipulation.

+

tidyr - Tools for changing the layout of your data sets. Use the gather and spread functions to convert your data into the tidy format, the layout R likes best.

+

stringr - Easy to learn tools for regular expressions and character strings.

+

lubridate - Tools that make working with dates and times easier.

+ +

To visualize data

+ +

ggplot2 - R's famous package for making beautiful graphics. ggplot2 lets you use the grammar of graphics to build layered, customizable plots.

+

ggvis - Interactive, web based graphics built with the grammar of graphics.

+

rgl - Interactive 3D visualizations with R

+

htmlwidgets - A fast way to build interactive (JavaScript-based) visualizations with R. Packages that implement htmlwidgets include:

+ +

 

+

googleVis - Lets you use Google Chart tools to visualize data in R. Google Chart tools used to be called Gapminder, the graphing software Hans Rosling made famous in his TED talk.

+ +

To model data

+ +

car - car's Anova function is popular for making type II and type III Anova tables.

+

mgcv - Generalized Additive Models

+

lme4/nlme - Linear and Non-linear mixed effects models

+

randomForest - Random forest methods from machine learning

+

multcomp - Tools for multiple comparison testing

+

vcd - Visualization tools and tests for categorical data

+

glmnet - Lasso and elastic-net regression methods with cross validation

+

survival - Tools for survival analysis

+

caret - Tools for training regression and classification models

+ +

To report results

+ +

shiny - Easily make interactive web apps with R. A perfect way to explore data and share findings with non-programmers.

+

R Markdown - The perfect workflow for reproducible reporting. Write R code in your markdown reports. When you run render, R Markdown will replace the code with its results and then export your report as an HTML, PDF, or MS Word document, or an HTML or PDF slideshow. The result? Automated reporting. R Markdown is integrated straight into RStudio.

+

xtable - The xtable function takes an R object (like a data frame) and returns the LaTeX or HTML code you need to paste a pretty version of the object into your documents. Copy and paste, or pair up with R Markdown.

+ +

For Spatial data

+ +

sp, maptools - Tools for loading and using spatial data including shapefiles.

+

maps - Easy to use map polygons for plots.

+

ggmap - Download street maps straight from Google maps and use them as a background in your ggplots.

+ +

For Time Series and Financial data

+ +

zoo - Provides the most popular format for saving time series objects in R.

+

xts - Very flexible tools for manipulating time series data sets.

+

quantmod - Tools for downloading financial data, plotting common charts, and doing technical analysis.

+ +

To write high performance R code

+ +

Rcpp - Write R functions that call C++ code for lightning fast speed.

+

data.table - An alternative way to organize data sets for very, very fast operations. Useful for big data.

+

parallel - Use parallel processing in R to speed up your code or to crunch large data sets.

+ +

To work with the web

+ +

XML - Read and create XML documents with R

+

jsonlite - Read and create JSON data tables with R

+

httr - A set of useful tools for working with http connections

+ +

To write your own R packages

+ +

devtools - An essential suite of tools for turning your code into an R package.

+

testthat - testthat provides an easy way to write unit tests for your code projects.

+

roxygen2 - A quick way to document your R packages. roxygen2 turns inline code comments into documentation pages and builds a package namespace.

+

You can also read about the entire package development process online in Hadley Wickham's R Packages book

+ + +### Top 100 R packages for 2013 (Jan-May) + +[This article](http://www.r-statistics.com/2013/06/top-100-r-packages-for-2013-jan-may/) gave a statistics report about the most downloaded R packages in
+
+| Rank | Package | Title | Downloads |
+| --- | --- | --- | --- |
+| 1 | plyr | Tools for splitting, applying and combining data | 84049 |
+| 2 | digest | Create cryptographic hash digests of R objects | 83192 |
+| 3 | ggplot2 | An implementation of the Grammar of Graphics | 82768 |
+| 4 | colorspace | Color Space Manipulation | 81901 |
+| 5 | stringr | Make it easier to work with strings | 77658 |
+| 6 | RColorBrewer | ColorBrewer palettes | 66783 |
+| 7 | reshape2 | Flexibly reshape data: a reboot of the reshape package | 64911 |
+| 8 | zoo | S3 Infrastructure for Regular and Irregular Time Series (Z’s ordered observations) | 60844 |
+| 9 | proto | Prototype object-based programming | 59043 |
+| 10 | scales | Scale functions for graphics | 58369 |
+| 11 | car | Companion to Applied Regression | 57453 |
+| 12 | dichromat | Color Schemes for Dichromats | 56624 |
+| 13 | gtable | Arrange grobs in tables | 54431 |
+| 14 | munsell | Munsell colour system | 53183 |
+| 15 | labeling | Axis Labeling | 51877 |
+| 16 | Hmisc | Harrell Miscellaneous | 47836 |
+| 17 | rJava | Low-level R to Java interface | 47731 |
+| 18 | mvtnorm | Multivariate Normal and t Distributions | 46884 |
+| 19 | bitops | Bitwise Operations | 45689 |
+| 20 | rgl | 3D visualization device system (OpenGL) | 41001 |
+| 21 | foreign | Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, .. | 37849 |
+| 22 | XML | Tools for parsing and generating XML within R and S-Plus | 37153 |
+| 23 | lattice | Lattice Graphics | 36597 |
+| 24 | e1071 | Misc Functions of the Department of Statistics (e1071), TU Wien | 35180 |
+| 25 | gtools | Various R programming tools | 35028 |
+| 26 | sp | classes and methods for spatial data | 34786 |
+| 27 | gdata | Various R programming tools for data manipulation | 34262 |
+| 28 | Rcpp | Seamless R and C++ Integration | 33929 |
+| 29 | MASS | Support Functions and Datasets for Venables and Ripley’s MASS | 33667 |
+| 30 | Matrix | Sparse and Dense Matrix Classes and Methods | 30740 |
+| 31 | lmtest | Testing Linear Regression Models | 30319 |
+| 32 | survival | Survival Analysis | 30186 |
+| 33 | caTools | Tools: moving window statistics, GIF, Base64, ROC AUC, etc | 29945 |
+| 34 | multcomp | Simultaneous Inference in General Parametric Models | 29871 |
+| 35 | RCurl | General network (HTTP/FTP/…) client interface for R | 28866 |
+| 36 | knitr | A general-purpose package for dynamic report generation in R | 28104 |
+| 37 | xtable | Export tables to LaTeX or HTML | 28091 |
+| 38 | xts | eXtensible Time Series | 28058 |
+| 39 | rpart | Recursive Partitioning | 27812 |
+| 40 | evaluate | Parsing and evaluation tools that provide more details than the default | 27617 |
+| 41 | RODBC | ODBC Database Access | 26131 |
+| 42 | quadprog | Functions to solve Quadratic Programming Problems | 25433 |
+| 43 | tseries | Time series analysis and computational finance | 25144 |
+| 44 | DBI | R Database Interface | 24793 |
+| 45 | nlme | Linear and Nonlinear Mixed Effects Models | 24360 |
+| 46 | lme4 | Linear mixed-effects models using S4 classes | 24199 |
+| 47 | reshape | Flexibly reshape data | 24118 |
+| 48 | sandwich | Robust Covariance Matrix Estimators | 24016 |
+| 49 | leaps | regression subset selection | 23666 |
+| 50 | gplots | Various R programming tools for plotting data | 23251 |
+| 51 | abind | Combine multi-dimensional arrays | 22758 |
+| 52 | randomForest | Breiman and Cutler’s random forests for classification and regression | 22401 |
+| 53 | Rcmdr | R Commander | 22131 |
+| 54 | coda | Output analysis and diagnostics for MCMC | 21900 |
+| 55 | maps | Draw Geographical Maps | 21550 |
+| 56 | igraph | Network analysis and visualization | 21423 |
+| 57 | formatR | Format R Code Automatically | 21049 |
+| 58 | maptools | Tools for reading and handling spatial objects | 20957 |
+| 59 | RSQLite | SQLite interface for R | 19671 |
+| 60 | psych | Procedures for Psychological, Psychometric, and Personality Research | 19545 |
+| 61 | KernSmooth | Functions for kernel smoothing for Wand & Jones (1995) | 19166 |
+| 62 | rgdal | Bindings for the Geospatial Data Abstraction Library | 19064 |
+| 63 | RcppArmadillo | Rcpp integration for Armadillo templated linear algebra library | 18899 |
+| 64 | effects | Effect Displays for Linear, Generalized Linear, Multinomial-Logit, Proportional-Odds Logit Models and Mixed-Effects Models | 18843 |
+| 65 | sem | Structural Equation Models | 18711 |
+| 66 | vcd | Visualizing Categorical Data | 18589 |
+| 67 | XLConnect | Excel Connector for R | 18230 |
+| 68 | markdown | Markdown rendering for R | 18211 |
+| 69 | timeSeries | Rmetrics – Financial Time Series Objects | 17932 |
+| 70 | timeDate | Rmetrics – Chronological and Calendar Objects | 17838 |
+| 71 | RJSONIO | Serialize R objects to JSON, JavaScript Object Notation | 17801 |
+| 72 | cluster | Cluster Analysis Extended Rousseeuw et al | 17136 |
+| 73 | scatterplot3d | 3D Scatter Plot | 17110 |
+| 74 | nnet | Feed-forward Neural Networks and Multinomial Log-Linear Models | 17074 |
+| 75 | fBasics | Rmetrics – Markets and Basic Statistics | 16278 |
+| 76 | forecast | Forecasting functions for time series and linear models | 15638 |
+| 77 | quantreg | Quantile Regression | 15509 |
+| 78 | foreach | Foreach looping construct for R | 15405 |
+| 79 | chron | Chronological objects which can handle dates and times | 15226 |
+| 80 | plotrix | Various plotting functions | 15142 |
+| 81 | matrixcalc | Collection of functions for matrix calculations | 15107 |
+| 82 | aplpack | Another Plot PACKage: stem.leaf, bagplot, faces, spin3R, and some slider functions | 14654 |
+| 83 | strucchange | Testing, Monitoring, and Dating Structural Changes | 14503 |
+| 84 | iterators | Iterator construct for R | 14449 |
+| 85 | mgcv | Mixed GAM Computation Vehicle with GCV/AIC/REML smoothness estimation | 14186 |
+| 86 | kernlab | Kernel-based Machine Learning Lab | 14135 |
+| 87 | SparseM | Sparse Linear Algebra | 13921 |
+| 88 | tree | Classification and regression trees | 13871 |
+| 89 | robustbase | Basic Robust Statistics | 13778 |
+| 90 | vegan | Community Ecology Package | 13686 |
+| 91 | devtools | Tools to make developing R code easier | 13488 |
+| 92 | latticeExtra | Extra Graphical Utilities Based on Lattice | 13253 |
+| 93 | modeltools | Tools and Classes for Statistical Models | 13233 |
+| 94 | xlsx | Read, write, format Excel 2007 and Excel 97/2000/XP/2003 files | 13097 |
+| 95 | slam | Sparse Lightweight Arrays and Matrices | 13060 |
+| 96 | TTR | Technical Trading Rules | 12894 |
+| 97 | quantmod | Quantitative Financial Modelling Framework | 12892 |
+| 98 | relimp | Relative Contribution of Effects in a Regression Model | 12692 |
+| 99 | akima | Interpolation of irregularly spaced data | 12680 |
+| 100 | memoise | Memoise functions | 12600 |
+ + +### R Packages in Docklet ### + +The Docklet base image provides `r-base`, `r-recommended`, and `r-cran-*`, i.e., +all R packages available in the Ubuntu apt sources. + +Here is the list: + + * r-base - GNU R statistical computation and graphics system + * r-base-core - GNU R core of statistical computation and graphics system + * r-base-core-dbg - GNU R debug symbols for statistical comp. language and environment + * r-base-dev - GNU R installation of auxiliary GNU R packages + * r-base-html - GNU R html docs for statistical computing system functions + * r-bioc-qvalue - GNU R package for Q-value estimation for FDR control + * r-cran-abind - GNU R abind multi-dimensional array combination function + * r-cran-acepack - GNU R package for regression transformations + * r-cran-ade4 - GNU R analysis of ecological data + * r-cran-afex - GNU R package for analyzing factorial experiments using ANOVA or mixed models + * r-cran-amelia - GNU R package supporting multiple imputation of missing data + * r-cran-amore - GNU R: A MORE flexible neural network package + * r-cran-ape - GNU R package for Analyses of Phylogenetics and Evolution + * r-cran-base64enc - GNU R package that provides tools for base64 encoding + * r-cran-batchjobs - GNU R batch computing + * r-cran-bayesfactor - GNU R package providing Bayes factors for t-tests, ANOVAs and contingency tables + * r-cran-bayesm - GNU R package for Bayesian inference + * r-cran-bbmisc - GNU R Miscellaneous helper functions for B. 
Bischl + * r-cran-beeswarm - bee swarm plot, an alternative to stripchart + * r-cran-biasedurn - GNU R Biased Urn model distributions + * r-cran-bio3d - GNU R package for biological structure analysis + * r-cran-bitops - GNU R package implementing bitwise operations + * r-cran-blockmodeling - Generalized and classical blockmodeling of valued networks + * r-cran-bms - GNU R package for Bayesian model averaging for linear models + * r-cran-boolnet - assembling, analyzing and visualizing Boolean networks + * r-cran-boot - GNU R package for bootstrapping functions from Davison and Hinkley + * r-cran-bradleyterry2 - GNU R package for using Bradley-Terry models + * r-cran-brew - GNU R templating framework for report generation + * r-cran-brglm - GNU R package for bias reduction in binomial-response GLMs + * r-cran-cairodevice - GNU R Cairo/Gtk2 device driver package + * r-cran-car - GNU R Companion to Applied Regression by John Fox + * r-cran-caret - GNU R package for classification and regression training + * r-cran-catools - GNU R package providing various utility functions + * r-cran-checkmate - GNU R fast and versatile argument checks + * r-cran-chron - GNU R package for chronologically ordered objects + * r-cran-class - GNU R package for classification + * r-cran-cluster - GNU R package for cluster analysis by Rousseeuw et al + * r-cran-cmprsk - GNU R subdistribution analysis of competing risks + * r-cran-coda - Output analysis and diagnostics for MCMC simulations in R + * r-cran-codetools - GNU R package providing code analysis tools + * r-cran-coin - GNU R package providing conditional inference procedures + * r-cran-colorspace - GNU R Color Space Manipulation + * r-cran-combinat - GNU R package with utilities for combinatorics + * r-cran-contfrac - GNU R package providing various utilities for evaluating continued fractions + * r-cran-conting - GNU R package for Bayesian analysis of contingency tables + * r-cran-crayon - GNU R colored terminal output + * 
r-cran-cubature - GNU R package for adaptive multivariate integration + * r-cran-date - GNU R package for date handling + * r-cran-dbi - GNU R package providing a generic database interface + * r-cran-deal - Learning Bayesian Networks with Mixed Variables + * r-cran-deldir - GNU R Delaunay Triangulation and Dirichlet (Voronoi) Tessellation + * r-cran-deoptimr - GNU R package for Differential Evolution in pure R + * r-cran-desolve - GNU R package providing functions that solve initial value problems + * r-cran-diagnosismed - medical diagnostic test accuracy analysis toolkit + * r-cran-dichromat - Color schemes for dichromats + * r-cran-digest - GNU R package for 'hash digest' of R data structures + * r-cran-domc - GNU R parallel execution backend for %dopar% using multicore + * r-cran-doparallel - GNU R foreach parallel adaptor for the parallel package + * r-cran-dosefinding - Planning and Analyzing Dose Finding experiments + * r-cran-dosnow - GNU R parallel execution backend for %dopar% using snow + * r-cran-downloader - GNU R package for downloading files over http and https + * r-cran-e1071 - GNU R package with miscellaneous functions of the Dept of Statistics (e1071) + * r-cran-eco - GNU R routines for Bayesian ecological inference + * r-cran-ecodist - GNU R package for dissimilarity-based ecological analysis + * r-cran-effects - GNU R graphical and tabular effects display for glm models + * r-cran-elliptic - GNU R package providing elliptic and related functions + * r-cran-energy - GNU R package for energy statistics for distribution comparison + * r-cran-epi - GNU R epidemiological analysis + * r-cran-epibasix - GNU R Elementary Epidemiological Functions + * r-cran-epicalc - GNU R Epidemiological calculator + * r-cran-epir - GNU R Functions for analysing epidemiological data + * r-cran-epitools - GNU R Epidemiology Tools for Data and Graphics + * r-cran-erm - GNU R package for 'extended Rasch modelling' + * r-cran-estimability - GNU R package providing tools for 
determining estimability of linear functions + * r-cran-etm - GNU R empirical transition matrix + * r-cran-evaluate - GNU R parsing and evaluation tools + * r-cran-evd - GNU R Functions for extreme value distributions + * r-cran-expm - GNU R Computation of the matrix exponential and related quantities + * r-cran-fail - GNU R File Abstraction Interface Layer (FAIL) mimicking a key-value store + * r-cran-fasianoptions - GNU R package for financial engineering -- fAsianOptions + * r-cran-fassets - GNU R package for financial engineering -- fAssets + * r-cran-fastcluster - Fast hierarchical clustering routines for GNU R + * r-cran-fastmatch - GNU R package for fast match replacement for repeated look-ups + * r-cran-fbasics - GNU R package for financial engineering -- fBasics + * r-cran-fbonds - GNU R package for financial engineering -- fBonds + * r-cran-fcopulae - GNU R package for financial engineering -- fCopulae + * r-cran-fexoticoptions - GNU R package for financial engineering -- fExoticOptions + * r-cran-fextremes - GNU R package for financial engineering -- fExtremes + * r-cran-fgarch - GNU R package for financial engineering -- fGarch + * r-cran-fimport - GNU R package for financial engineering -- fImport + * r-cran-fmultivar - GNU R package for financial engineering -- fMultivar + * r-cran-fnonlinear - GNU R package for financial engineering -- fNonlinear + * r-cran-foptions - GNU R package for financial engineering -- fOptions + * r-cran-foreach - GNU R foreach looping support + * r-cran-foreign - GNU R package to read/write data from other stat. 
systems + * r-cran-formatr - Format R code automatically + * r-cran-formula - GNU R package for extended model formulas + * r-cran-fportfolio - GNU R package for financial engineering -- fPortfolio + * r-cran-fregression - GNU R package for financial engineering -- fRegression + * r-cran-ftrading - GNU R package for financial engineering -- fTrading + * r-cran-funitroots - GNU R package for financial engineering -- fUnitRoots + * r-cran-futile.logger - logging utility for GNU R + * r-cran-futile.options - GNU R futile options management + * r-cran-g.data - GNU R package for delayed-data + * r-cran-gam - Generalized Additive Models for R + * r-cran-gbm - GNU R package "Generalized Boosted Regression Models" + * r-cran-gdata - GNU R package with data manipulation tools by Greg Warnes et al + * r-cran-geepack - Generalized Estimating Equation Package for R + * r-cran-genabel - GNU R package for genome-wide SNP association analysis + * r-cran-genabel.data - data package for genome-wide SNP association analysis + * r-cran-genetics - GNU R package for population genetics + * r-cran-getopt - GNU R package providing command-line parsing functionality + * r-cran-ggplot2 - implementation of the Grammar of Graphics + * r-cran-gmaps - GNU R support for producing geographic maps with grid graphics + * r-cran-gmodels - GNU R package with tools for model fitting by Greg Warnes et al + * r-cran-gnm - GNU R package for generalized nonlinear models + * r-cran-gplots - GNU R package with tools for plotting data by Greg Warnes et al + * r-cran-gregmisc - GNU R package with miscellaneous functions by Greg Warnes et al + * r-cran-gridextra - GNU R package with extensions for the grid package + * r-cran-gsl - GNU R wrapper for the GNU Scientific Library + * r-cran-gss - GNU R package for multivariate estimation using smoothing splines + * r-cran-gtable - Arrange grobs in tables + * r-cran-gtools - GNU R package with R programming tools by Greg Warnes et al + * r-cran-haplo.stats - GNU R 
package for haplotype analysis + * r-cran-hdf5 - GNU R package interfacing the NCSA HDF5 library + * r-cran-hmisc - GNU R miscellaneous functions by Frank Harrell + * r-cran-hwriter - HTML Writer - Outputs R objects in HTML format + * r-cran-hypergeo - GNU R package providing the Gaussian hypergeometric for complex numbers + * r-cran-igraph - GNU R network analysis and visualization + * r-cran-inline - GNU R package to inline C, C++, Fortran functions from R + * r-cran-int64 - GNU R package for 64 bit integer types + * r-cran-iterators - GNU R iterator support for vectors, lists and other containers + * r-cran-its - GNU R package for handling irregular time series + * r-cran-jsonlite - Robust, High Performance JSON Parser and Generator for R + * r-cran-kernlab - GNU R package for kernel-based machine learning lab + * r-cran-kernsmooth - GNU R package for kernel smoothing and density estimation + * r-cran-labeling - GNU R Axis Labeling optimization + * r-cran-lambda.r - GNU R modeling data with functional programming + * r-cran-lattice - GNU R package for 'Trellis' graphics + * r-cran-latticeextra - GNU R package of additional graphical displays based on lattice + * r-cran-learnbayes - GNU R functions for learning bayesian inference + * r-cran-lhs - GNU R package "lhs: Latin Hypercube Samples" + * r-cran-littler - GNU R scripting and command-line front-end + * r-cran-lme4 - GNU R package for linear mixed effects model fitting + * r-cran-lmtest - GNU R package for diagnostic checking in linear models + * r-cran-logspline - GNU R package providing routines for the logspline density estimation + * r-cran-lpsolve - GNU R package providing linear program solvers + * r-cran-lsmeans - GNU R package providing least-squares means for linear, generalized linear, and mixed models + * r-cran-magrittr - GNU R forward-pipe operator + * r-cran-maldiquant - GNU R package for quantitative analysis of mass spectrometry data + * r-cran-maldiquantforeign - GNU R package providing 
import/export routines for MALDIquant + * r-cran-mapdata - GNU R support for producing geographic maps (supplemental data) + * r-cran-mapproj - GNU R support for cartographic projections of map data + * r-cran-maps - GNU R support for producing geographic maps + * r-cran-mass - GNU R package of Venables and Ripley's MASS + * r-cran-matchit - GNU R package of nonparametric matching methods + * r-cran-matrix - GNU R package of classes for dense and sparse matrices + * r-cran-matrixmodels - GNU R package for sparse and dense matrix models + * r-cran-matrixstats - GNU R methods that apply to rows and columns of a matrix + * r-cran-maxlik - GNU R maximum likelihood estimation + * r-cran-mcmcpack - R routines for Markov chain Monte Carlo model estimation + * r-cran-medadherence - GNU R Medication Adherence: Commonly Used Definitions + * r-cran-memoise - Memoise functions + * r-cran-mfilter - GNU R package providing miscellaneous time series filters + * r-cran-mgcv - GNU R package for multiple parameter smoothing estimation + * r-cran-mime - R package which maps filenames to MIME Types + * r-cran-minqa - GNU R package for quadratic optimisation without derivatives + * r-cran-misc3d - GNU R collection of 3d plot functions and rgl-based isosurfaces + * r-cran-misctools - GNU R miscellaneous tools and utilities + * r-cran-mixtools - GNU R tools for analyzing finite mixture models + * r-cran-mnormt - GNU R package providing multivariate normal and t distribution + * r-cran-mnp - GNU R package for fitting multinomial probit (MNP) models + * r-cran-modeltools - GNU R package providing a collection of tools to deal with statistical models + * r-cran-msm - GNU R Multi-state Markov and hidden Markov models in continuous time + * r-cran-multcomp - GNU R package for multiple comparison procedures + * r-cran-multicore - GNU R parallel processing on multi-core or multi-cpu machines + * r-cran-munsell - Munsell colour system + * r-cran-mvnormtest - GNU R package for multivariate 
normality test + * r-cran-mvtnorm - GNU R package to compute multivariate Normal and T distributions + * r-cran-ncdf4 - GNU R interface to Unidata netCDF format data files + * r-cran-nlme - GNU R package for (non-)linear mixed effects models + * r-cran-nloptr - GNU R package for interface to NLopt + * r-cran-nnet - GNU R package for feed-forward neural networks + * r-cran-nnls - GNU R package for non-negative least squares (the Lawson-Hanson algorithm) + * r-cran-numderiv - GNU R package for accurate numerical derivatives + * r-cran-nws - GNU R package for distributed programming via NetWorkSpaces + * r-cran-optparse - GNU/R Command line option parser + * r-cran-pbapply - GNU R package providing progress bars for vectorized R functions + * r-cran-pbivnorm - GNU R package for calculating probabilities from a bivariate normal CDF + * r-cran-pbkrtest - GNU R package for tests in linear mixed-effect models + * r-cran-permute - R functions for generating restricted permutations of data + * r-cran-phangorn - GNU R package for phylogenetic analysis + * r-cran-pkgkitten - GNU R package to create simple packages + * r-cran-plotrix - GNU R package providing various plotting functions + * r-cran-plyr - tools for splitting, applying and combining data + * r-cran-polspline - GNU R package providing polynomial spline fitting + * r-cran-polyclip - GNU R Polygon Clipping + * r-cran-polycub - GNU R Cubature over Polygonal Domains + * r-cran-praise - GNU R praise users + * r-cran-profilemodel - GNU R tools for profiling inference functions + * r-cran-proto - Prototype object-based programming + * r-cran-pscl - GNU R package for discrete data models + * r-cran-psy - GNU R procedures for psychometrics + * r-cran-pvclust - Hierarchical Clustering with P-Values via Multiscale Bootstrap + * r-cran-pwt - GNU R package for the Penn World Tables (version 5.6 to 7.1) + * r-cran-pwt8 - GNU R package for the Penn World Tables (version 8) + * r-cran-qtl - GNU R package for genetic marker 
linkage analysis + * r-cran-quadprog - GNU R package for solving quadratic programming problems + * r-cran-quantreg - GNU R package for quantile regression + * r-cran-qvcalc - GNU R package for computing quasi variances for factor effects + * r-cran-r.methodss3 - GNU R utility function for defining S3 methods + * r-cran-r.oo - GNU R object-oriented programming with or without references + * r-cran-r.utils - GNU R various programming utilities + * r-cran-randomfields - GNU R simulation and analysis of random fields + * r-cran-randomforest - GNU R package implementing the random forest classificator + * r-cran-raschsampler - GNU R package for sampling binary matrices with fixed margins + * r-cran-rcmdr - GNU R platform-independent basic-statistics GUI + * r-cran-rcmdrmisc - GNU R package for miscellaneous Rcmdr utilities + * r-cran-rcolorbrewer - GNU R package providing suitable color palettes + * r-cran-rcpp - GNU R package for Seamless R and C++ Integration + * r-cran-rcpparmadillo - GNU R package for Armadillo C++ linear algebra library + * r-cran-rcppeigen - GNU R package for Eigen templated linear algebra + * r-cran-rcurl - GNU R General network (HTTP/FTP/...) 
client interface + * r-cran-readbrukerflexdata - GNU R package to read Bruker Daltonics \*flex format files + * r-cran-readmzxmldata - GNU R package to read mass spectrometry data in mzXML format + * r-cran-readxl - GNU R package to read Excel files + * r-cran-relimp - GNU R package for inference on relative importance of regressors + * r-cran-reshape - Flexibly reshape data + * r-cran-reshape2 - Flexibly reshape data: a reboot of the reshape package + * r-cran-rggobi - GNU R package for the GGobi data visualization system + * r-cran-rgl - GNU R package for three-dimensional visualisation using OpenGL + * r-cran-rglpk - GNU R interface to the GNU Linear Programing Kit + * r-cran-rgtk2 - GNU R binding for Gtk2 + * r-cran-rinside - GNU R package to embed R in C++ application + * r-cran-rjags - R interface to the JAGS Bayesian statistics package + * r-cran-rjava - GNU R low-level interface to Java + * r-cran-rjson - GNU R package for converting between R and JSON objects + * r-cran-rmpi - GNU R package interfacing MPI libraries for distributed computing + * r-cran-rms - GNU R regression modeling strategies by Frank Harrell + * r-cran-rmysql - GNU R package providing a DBI-compliant interface to MySQL + * r-cran-rneos - GNU R package with XML-RPC interface to NEOS + * r-cran-rnetcdf - GNU R package that provides an R interface to NetCDF datasets + * r-cran-rniftilib - GNU/R interface to NIFTICLIB + * r-cran-robustbase - GNU R package providing basic robust statistics + * r-cran-rocr - GNU R package to prepare and display ROC curves + * r-cran-rodbc - GNU R package for ODBC database access + * r-cran-rpart - GNU R package for recursive partitioning and regression trees + * r-cran-rpostgresql - GNU R package providing database interface and driver for PostgreSQL + * r-cran-rquantlib - GNU R package interfacing the QuantLib finance library + * r-cran-rsclient - GNU R package providing an Rserve client + * r-cran-rsdmx - GNU R package for the Statistical Data and Metadata 
Exchange (SDMX) framework + * r-cran-rserve - GNU R Rserve tcp/ip server and sample clients + * r-cran-rsolnp - GNU R general non-linear optimization + * r-cran-rsprng - GNU R interface to SPRNG (Scalable Parallel RNGs) + * r-cran-rsqlite - Database Interface R driver for SQLite + * r-cran-rsymphony - GNU R interface to the SYMPHONY MILP solver + * r-cran-runit - GNU R package providing unit testing framework + * r-cran-sandwich - GNU R package for model-robust standard error estimates + * r-cran-scales - Scale functions for graphics + * r-cran-scatterplot3d - GNU R package for Visualizing Multivariate Data + * r-cran-segmented - GNU R segmented relationships in regression models + * r-cran-sendmailr - send email using GNU R + * r-cran-seqinr - GNU R biological sequences retrieval and analysis + * r-cran-seroincidence - GNU R seroincidence calculator tool + * r-cran-slam - GNU R sparse lighweight arrays and matrices package + * r-cran-sm - GNU R package for kernel smoothing methods + * r-cran-sn - GNU R package providing skew-normal and skew-t distributions + * r-cran-snow - GNU R package for 'simple network of workstations' + * r-cran-sp - GNU R classes and methods for spatial data + * r-cran-spam - GNU R functions for sparse matrix algebra + * r-cran-sparsem - GNU R package for basic linear algebra for sparse matrices + * r-cran-spatial - GNU R package for spatial statistics + * r-cran-spatstat - GNU R Spatial Point Pattern analysis, model-fitting, simulation, tests + * r-cran-spc - GNU R Statistical Process Control + * r-cran-stabledist - GNU R package for stable distribution functions + * r-cran-statmod - GNU R package providing algorithms and functions for statistical modeling + * r-cran-stringi - GNU R character string processing facilities + * r-cran-stringr - Make it easier to work with strings + * r-cran-strucchange - GNU R package for structural change regression estimation + * r-cran-survival - GNU R package for survival analysis + * r-cran-tcltk2 - GNU 
R package for Tcl/Tk additions + * r-cran-teachingdemos - GNU R Demonstrations for teaching and learning + * r-cran-tensor - GNU R Tensor product of arrays + * r-cran-testthat - GNU R testsuite + * r-cran-tgp - GNU R package "tgp: Bayesian treed Gaussian process models" + * r-cran-th.data - GNU R package for datasets by Torsten Hothorn + * r-cran-timedate - GNU R package for financial engineering -- timeDate + * r-cran-timeseries - GNU R package for financial engineering -- timeSeries + * r-cran-tkrplot - GNU R embedded Tk plotting device package + * r-cran-truncnorm - GNU R truncated normal distribution + * r-cran-tseries - GNU R package for time-series analysis and comp. finance + * r-cran-urca - GNU R package providing unit root and cointegration tests + * r-cran-vcd - GNU R Visualizing Categorical Data + * r-cran-vcdextra - GNU R package providing extensions and additions to the vcd package + * r-cran-vegan - Community Ecology Package for R + * r-cran-vgam - GNU R package for estimating vector generalized additive models + * r-cran-vioplot - GNU R toolbox for violin plots + * r-cran-wdi - GNU R package for accessing the World Development Indicators + * r-cran-xml - GNU R package for XML parsing and generation + * r-cran-xtable - GNU R coerce data to LaTeX and HTML tables + * r-cran-xts - GNU R package for time series analysis -- xts + * r-cran-yaml - Methods to convert R data to YAML and back + * r-cran-zelig - GNU R package providing a unified front-end for estimating statistical models + * r-cran-zoo - GNU R package for totally ordered indexed observations + * r-doc-html - GNU R html manuals for statistical computing system + * r-doc-info - GNU R info manuals statistical computing system + * r-doc-pdf - GNU R pdf manuals for statistical computing system + * r-mathlib - GNU R standalone mathematics library + * r-recommended - GNU R collection of recommended packages [metapackage] + * r-cran-maptools - GNU R Tools for reading and handling spatial objects + * 
r-cran-surveillance - GNU R package for the Modeling and Monitoring of Epidemic Phenomena + + diff --git a/userguide/en/portal/README.md b/userguide/en/portal/README.md new file mode 100644 index 0000000..1139af8 --- /dev/null +++ b/userguide/en/portal/README.md @@ -0,0 +1,10 @@ +## Docklet Portal ## + +The web pages of Docklet portal include: + +* [Dashboard](dashboard.md) +* [Config](config.md) +* [Status](status.md) +* [Hosts](hosts.md) +* [Users](users.md) +* [Admin](admin.md) diff --git a/userguide/en/portal/admin.md b/userguide/en/portal/admin.md new file mode 100644 index 0000000..091d49a --- /dev/null +++ b/userguide/en/portal/admin.md @@ -0,0 +1,4 @@ +## Admin ## + +This webpage requires *admin* role to access. + diff --git a/userguide/en/portal/config.md b/userguide/en/portal/config.md new file mode 100644 index 0000000..6c5b17b --- /dev/null +++ b/userguide/en/portal/config.md @@ -0,0 +1,22 @@ +## Config ## + +Users will configure their Workspace here, mainly the management of +vclusters and images. + +The following picture shows the Config webpage of **tee** Workspace: + +workspace
+config + +Users can add nodes to or remove nodes from the vcluster. The above +picture shows the *tee* Workspace with two nodes, together with detailed +information such as each node's name, IP address, status and the image it was created +from. + +Click **Save** to save the Workspace as an image named *myimage*, which +can be used later on. The image list is now shown as: + +image config + +Users can also **Share** their saved private images with others, to help +them avoid tedious installation and configuration work. diff --git a/userguide/en/portal/dashboard.md b/userguide/en/portal/dashboard.md new file mode 100644 index 0000000..9065aac --- /dev/null +++ b/userguide/en/portal/dashboard.md @@ -0,0 +1,54 @@ +## Dashboard ## + +The Dashboard is where users manage their Workspaces. Users can view, +create, start, stop, delete and enter their Workspaces. + +Initially the Workspace list is empty, so users need to create a Workspace +first. Clicking **Add Workspace** opens the web page for creating a +Workspace. There, select the image and create a Workspace named **tee**, as +illustrated below: + +add 
+workspace + +If the creation succeeds, the Workspace list changes to: + +dashboard
+stopped status + +We can see that a Workspace named **tee** has been created, with status **Stopped**. +Now the user can choose to **Start** or **Delete** the Workspace. + +A *Stopped* Workspace still exists and waits for the next start. Users are +encouraged to stop their Workspaces when not running jobs, to reduce +resource consumption. + +Now click **Start** to start the Workspace. The Workspace status changes +to: + +dashboard
+running status + +The Workspace status is now **Running**. Users can choose to **Stop** the +Workspace, or **Go** into the Workspace. They can also click the name +*tee* to [Config](config.md) the Workspace. Clicking the status +**Running** shows the detailed vcluster [Status](status.md). + +**Note**: + +1. If users click **Go** soon after creating a Workspace, +they may get the error message `503: Proxy Target Missing`. This is probably +because the Workspace backend has not finished initializing. Clicking +**Go** again a little later will enter the Workspace as expected. + +2. If users click **Go** but get a `Not Found` error message: + +``` +Not Found + +The requested URL was not found on the server. If you entered the URL +manually please check your spelling and try again. +``` + +this is probably because the system has been through a recovery. Users need to +manually restart their Workspaces to regain access. diff --git a/userguide/en/portal/hosts.md b/userguide/en/portal/hosts.md new file mode 100644 index 0000000..fc8bdc9 --- /dev/null +++ b/userguide/en/portal/hosts.md @@ -0,0 +1,23 @@ +## Hosts ## + +This webpage requires the *admin* role to access. + + +The Hosts webpage shows the status of the Docklet physical cluster, +including host IP address, status, containers hosted, resource +usage, etc. + +The following shows the information of a demo Docklet physical cluster: + +hosts + +Click a host's **Realtime** button to see the configuration +information of the host: + +
+summary config of one host + +The page also shows the host's realtime resource usage: + +
+realtime performance of one host diff --git a/userguide/en/portal/status.md b/userguide/en/portal/status.md new file mode 100644 index 0000000..3554286 --- /dev/null +++ b/userguide/en/portal/status.md @@ -0,0 +1,13 @@ +## Status ## + +The Status web page shows vcluster monitoring information. Here users can see the load of their vclusters, including CPU and memory activity. + +The following picture shows the general load of the *tee* Workspace +vcluster: + +vcluster status + +Click **Realtime** to see the realtime usage of a node's CPU and +memory, shown as follows: + +vcluster status summary of one node diff --git a/userguide/en/portal/users.md b/userguide/en/portal/users.md new file mode 100644 index 0000000..0c6364e --- /dev/null +++ b/userguide/en/portal/users.md @@ -0,0 +1,24 @@ +## Users ## + +This webpage requires the *admin* role to access. +Through it, administrators can view and modify users' information, +and add users or user groups. + + + +### Registration & Activation ### +Currently users cannot register a Docklet account themselves; Docklet uses an external authentication system such as PAM to let users log in. + +A Docklet account needs to be activated before some functions can be used. When an external account is used for the first time, a notification is shown in the upper left corner of the webpage, saying that you should submit a request form. + +add workspace + +Clicking it opens the activation page. Your e-mail, student/staff ID, department, real name and reason for using Docklet are required. After the request is submitted, Docklet administrators will verify it within 3 days. + +Your browser will be logged out of Docklet automatically, and you need to log in again to refresh your user info. Until your request is verified, a notification is shown in the upper left corner. It can be closed by clicking it. 
+ +## Trial ## +Docklet supports a guest mode for trial use. Before your account is approved, you can click the **have a try** button on the home page to enter the + + +trial page. In trial mode, you can click the **Go** button to enter a workspace. diff --git a/userguide/en/software/README.md b/userguide/en/software/README.md new file mode 100644 index 0000000..ade90a1 --- /dev/null +++ b/userguide/en/software/README.md @@ -0,0 +1,10 @@ +# Development in WEB Terminal # + + * [Available Packages](list.md) + * [C/C++ Example](c.md) + * [Java Example](java.md) + * [MPI Example](mpi.md) + * [Spark Example](spark.md) + * [LaTeX Example](latex.md) + + diff --git a/userguide/en/software/c.md b/userguide/en/software/c.md new file mode 100644 index 0000000..bbc5fbe --- /dev/null +++ b/userguide/en/software/c.md @@ -0,0 +1,171 @@ +## C/C++ Example ## + +Docklet provides GCC 5+ and GDB for developing C and C++ programs. + +### C + +Below is the source code of file `sum.c`: + +```c +#include <stdio.h> +#define TOP 50 + +int main() +{ + int i = 0 , sum = 0 ; + while ( i <= TOP ) { + sum += i ; + i++ ; + } + printf(" Sum of 1 to %d is %d\n", TOP, sum ) ; + return 0 ; +} +``` + +Open WEB Terminal, compile and run: + +``` +$ gcc -g sum.c -o sum +$ ./sum + Sum of 1 to 50 is 1275 +``` + +### C++ ### + +The following is a buggy C++ program taken from [here](http://www.cprogramming.com/gdb.html). + +The buggy `main.cpp`: + +```c++ +#include <iostream> + +using namespace std; + +long factorial(int n); + +int main() +{ + int n(0); + cin>>n; + long val=factorial(n); + cout<<val; + cin.get(); + return 0; +} + +long factorial(int n) +{ + long result(1); + while(n--) + { + result*=n; + } + return result; +} +``` + +Compile the program with debugging information (`-g`) and load it into GDB. The start of the session is elided here: + +``` +... +8. Reading symbols from /root/main...done. +9. (gdb) break 11 +10. Breakpoint 1 at 0x80485f9: file main.cpp, line 11. +11. (gdb) run +12. Starting program: /root/main +13. 3 +14. +15. Breakpoint 1, main () at main.cpp:11 +16. 11 long val=factorial(n); +17. (gdb) step +18. factorial (n=3) at main.cpp:19 +19. 19 long result(1); +20. (gdb) list +21. 14 return 0; +22. 15 } +23. 16 +24. 17 long factorial(int n) +25. 18 { +26. 19 long result(1); +27. 20 while(n--) +28. 21 { +29. 22 result*=n; +30. 23 } +31. (gdb) watch n +32. Hardware watchpoint 2: n +33. (gdb) watch result +34. 
Hardware watchpoint 3: result +35. (gdb) continue +36. Continuing. +37. Hardware watchpoint 3: result +38. +39. Old value = 0 +40. New value = 1 +``` + +The variable `result` is initialized to 1. + +``` +41. factorial (n=3) at main.cpp:20 +42. 20 while(n--) +43. (gdb) +``` + +Note that no command is typed here; only Enter is pressed, +which re-executes the last command. + +``` +44. Continuing. +45. Hardware watchpoint 2: n +46. +47. Old value = 3 +48. New value = 2 +``` + +Now `n` is immediately decremented from 3 to 2. + +``` +49. 0x08048654 in factorial (n=2) at main.cpp:20 +50. 20 while(n--) +51. (gdb) +52. Continuing. +53. Hardware watchpoint 3: result +54. +55. Old value = 1 +56. New value = 2 +``` + +Now `result` becomes 2 (its earlier value multiplied by the value of +`n`). We've found the first bug! `result` is supposed to be evaluated by +multiplying 3 * 2 * 1, but here the multiplication starts from 2. To +correct it, we have to change the loop a bit. + +```c++ +while(n>0) //doesn't let n reach 0 +{ + result*=n; + n--; //decrements only after the evaluation +} +``` + +After applying the fix, the test passes. + +The full debugging process is described +[here](http://www.cprogramming.com/gdb.html). + +For how to use GDB, please refer to the official +[GDB manual](http://sourceware.org/gdb/current/onlinedocs/gdb/). diff --git a/userguide/en/software/java.md b/userguide/en/software/java.md new file mode 100644 index 0000000..5ccbc75 --- /dev/null +++ b/userguide/en/software/java.md @@ -0,0 +1,33 @@ +## Java Example ## + +Docklet provides OpenJDK 7+ for developing Java programs. + +`NumberFactorial.java` calculates the factorial of a number: + +```java +public class NumberFactorial { + public static void main(String[] args) { + int number = 5; + /* + * The factorial of a number n is written n!. + * For example, factorial of 4 is 4*3*2*1. 
+ */ + + int factorial = number; + + for(int i =(number - 1); i > 1; i--) { + factorial = factorial * i; + } + + System.out.println("Factorial of a number is " + factorial); + } +} +``` + +Open WEB Terminal, compile and run: + +``` +$ javac NumberFactorial.java +$ java NumberFactorial +Factorial of a number is 120 +``` diff --git a/userguide/en/software/latex.md b/userguide/en/software/latex.md new file mode 100644 index 0000000..91b9bc0 --- /dev/null +++ b/userguide/en/software/latex.md @@ -0,0 +1,37 @@ +## LaTeX Example ## + +Docklet provides the full [TeX Live 2015](http://www.tug.org/texlive). +Users can edit and compile LaTeX source files in WEB Terminal to +generate pdf files, then open the resulting pdf file in Jupyter Dashboard to +check the output. + +### Hello World + +A very simple `hello.tex`: + +```latex +\documentclass{article} +\title{My \LaTeX ~Hello World} +\author{Zhang San} +\date{Feb 20, 2016} +\begin{document} +\maketitle +Hello World! +\end{document} +``` + +Compile using `pdflatex` or `xelatex`: + +``` +$ xelatex hello +$ ls +hello.aux hello.log hello.pdf hello.tex +``` + +### CJK Char Support ### + +In TeX Live 2015, `xeCJK` + `xelatex` provides very good support for CJK and other +non-ASCII characters. CJK users only need to load the `xeCJK` package in their source file +and then compile using `xelatex`, which will automatically use the +available TrueType fonts in the system path, usually `/usr/share/fonts`. No extra work is needed. 
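The CJK workflow described above can be sketched as follows. This is a minimal sketch, not part of the original guide: the file name `cjk.tex` and the sample sentence are assumptions chosen for illustration, and the result depends on which CJK fonts are actually installed in the container.

```shell
# Write a minimal xeCJK document; the file name cjk.tex is an assumption.
cat > cjk.tex <<'EOF'
\documentclass{article}
\usepackage{xeCJK}  % CJK support, intended for use with xelatex
\begin{document}
Hello World! 你好,世界!
\end{document}
EOF

# Compile with xelatex when it is installed; otherwise only the source is written.
if command -v xelatex >/dev/null 2>&1; then
    xelatex -interaction=nonstopmode cjk.tex || echo "xelatex reported errors; see cjk.log"
else
    echo "xelatex not found; skipping compilation"
fi
```

If the compilation succeeds, `cjk.pdf` can be opened from the Jupyter Dashboard in the same way as `hello.pdf`.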
+ diff --git a/userguide/en/software/list.md b/userguide/en/software/list.md new file mode 100644 index 0000000..c566aeb --- /dev/null +++ b/userguide/en/software/list.md @@ -0,0 +1,38 @@ +## Software Packages ## + +The Docklet public base image has the following software: + +**System** +* vim, emacs, nano +* tmux, w3m, curl +* zsh +* unrar, p7zip-full, p7zip-rar +* openssh-server +* openbox +* vpnc ( web ) +* fonts-droid, fonts-wqy-zenhei, fonts-wqy-microhei, fonts-arphic-gbsn00lp, fonts-arphic-gkai00mp, fonts-arphic-ukai, fonts-arphic-uming, latex-cjk-all + +**Language** +* gcc, g++, gdb, make, autoconf, gfortran, scons +* git, subversion +* openjdk-7-jdk +* python3, pip, virtualenv, pep8, pyflakes +* scala +* haskell-platform +* ruby-full +* clisp, guile +* golang +* nodejs, npm, node-legacy +* r-base, r-recommended, r-cran-\* +* texlive-full +* pandoc, xsltproc, markdown + +**Tools** +* scipy, numpy, matplotlib, pandas, sympy, plotly, statsmodels, scikit-learn, nltk (with data), tensorflow +* jupyter notebook +* spark +* mpich2 +* storm +* elasticsearch +* kafka/zmq +* couchdb/mongodb/memcached/redis-server diff --git a/userguide/en/software/mpi.md b/userguide/en/software/mpi.md new file mode 100644 index 0000000..8ef93ad --- /dev/null +++ b/userguide/en/software/mpi.md @@ -0,0 +1,58 @@ +## MPI Example ## + +Docklet provides [MPICH2](http://www.mpich.org) for developing MPI programs. + +Assume the user's vcluster has two nodes: `host-0` and `host-1`. + +Open WEB Terminal to check the `/etc/hosts` file: + +``` +$ cat /etc/hosts +127.0.0.1 localhost +172.16.0.46 host-0 host-0.tee +172.16.0.45 host-1 host-1.tee +``` + +We can get the node list with the following command: + +``` +$ cat /etc/hosts | grep -v localhost | awk '{print $2}' +host-0 +host-1 +``` + +Now the WEB Terminal is running on `host-0`, the first node in the +vcluster. 
Edit `mpihello.c` on `host-0`: + +```c +#include "mpi.h" +#include <stdio.h> +#include <stdlib.h> +#define MASTER 0 + +int main (int argc, char *argv[]) +{ + int numtasks, taskid, len; + char hostname[MPI_MAX_PROCESSOR_NAME]; + + MPI_Init(&argc, &argv); + MPI_Comm_size(MPI_COMM_WORLD, &numtasks); + MPI_Comm_rank(MPI_COMM_WORLD,&taskid); + MPI_Get_processor_name(hostname, &len); + printf ("Hello from task %d on %s!\n", taskid, hostname); + if (taskid == MASTER) + printf("MASTER: Number of MPI tasks is: %d\n",numtasks); + MPI_Finalize(); + return 0 ; +} +``` + +Open WEB Terminal and run: + +``` +$ mpicc mpihello.c -o mpihello +$ mpirun -n 2 -hosts host-0,host-1 mpihello +Hello from task 0 on host-0! +MASTER: Number of MPI tasks is: 2 +Hello from task 1 on host-1! +``` diff --git a/userguide/en/software/spark.md b/userguide/en/software/spark.md new file mode 100644 index 0000000..bda06fd --- /dev/null +++ b/userguide/en/software/spark.md @@ -0,0 +1 @@ +## Spark Example ## diff --git a/userguide/en/workspace/README.md b/userguide/en/workspace/README.md new file mode 100644 index 0000000..acefd93 --- /dev/null +++ b/userguide/en/workspace/README.md @@ -0,0 +1,20 @@ +# Workspace # + +The Docklet Workspace is provided by the integrated open source component +[Jupyter Notebook](https://github.com/jupyter/notebook). +There is a great deal of documentation about Jupyter Notebook. Users are recommended +to consult the latest official documentation, +[The Jupyter notebook](http://jupyter-notebook.readthedocs.org/en/latest/index.html), where the [UI Components](http://jupyter-notebook.readthedocs.org/en/latest/ui_components.html) page describes the Jupyter Notebook user interface. + +The Jupyter Notebook dashboard is the entry point for all operations. 
A +demo snapshot is shown below: + + + +This section mainly focuses on the following Workspace functions: + +* [Data Management](data.md) +* [File Editing](edit.md) +* [WEB Terminal](terminal.md) +* [Interactive Programming](notebook.md) diff --git a/userguide/en/workspace/data.md b/userguide/en/workspace/data.md new file mode 100644 index 0000000..95e75f2 --- /dev/null +++ b/userguide/en/workspace/data.md @@ -0,0 +1,12 @@ +## Data Management ## + +When users enter their Jupyter Notebook Workspace, they will see +the tree of files and directories. After selecting some files, users can +**Rename** or **Delete** them. + +Click **New** - **Folder** to create a folder. Click **New** - **Text File** +to create a text file named *untitled.txt* and enter the [File Editing](edit.md) webpage. + +Click **Upload** to upload data files to the Workspace. + +**Note**: The `~/nfs` directory is the [Dataspace](../concepts/nfs.md). Files in the Dataspace are shared among all containers in the vcluster and are persistent. Files in other directories are transient and will be destroyed when the container is deleted. diff --git a/userguide/en/workspace/edit.md b/userguide/en/workspace/edit.md new file mode 100644 index 0000000..fc2fcd1 --- /dev/null +++ b/userguide/en/workspace/edit.md @@ -0,0 +1,15 @@ +## File Editing ## + +In the Jupyter Notebook dashboard, clicking a text file such as a .c, .py, .md, .txt or .log file opens the file editing webpage. + +Users can rename the file, e.g., change *untitled.txt* to *hello.c*. + +Those familiar with vi or emacs can choose **Vim** or **emacs** +mode in the **Edit** menu. The default edit mode is **Sublime Text**. + +The text editor supports dozens of programming languages, and can +automatically highlight keywords, auto-indent, etc. + +**Note**: The Jupyter Notebook editor currently only supports editing small +files. 
For large files, opening the editor may cause the whole system +to **hang**. diff --git a/userguide/en/workspace/notebook.md b/userguide/en/workspace/notebook.md new file mode 100644 index 0000000..bbfabe3 --- /dev/null +++ b/userguide/en/workspace/notebook.md @@ -0,0 +1,22 @@ +## Interactive Programming ## + +Jupyter Notebook has a very attractive feature: **interactive +programming** in the browser. It provides a web-based application suitable +for capturing the whole computation process: developing, documenting, +and executing code, as well as communicating the results. + +Main features include, but are not limited to: +* The ability to execute code from the browser, with the results of +computations attached to the code which generated them. +* Displaying the result of computation using rich media representations, +such as HTML, LaTeX, PNG, SVG, etc. For example, publication-quality +figures rendered by the matplotlib library can be included inline. +* In-browser editing of rich text using the Markdown markup language, +so that commentary on the code is not limited to plain text. +* The ability to easily include mathematical notation within markdown +cells using LaTeX, rendered natively by MathJax. + +Please refer to [The Jupyter Notebook](http://jupyter-notebook.readthedocs.org/en/latest/notebook.html) for more info. + +The [Development Demo](../notebook/README.md) section gives some help with programming in +Notebook, mainly using Python and R packages. diff --git a/userguide/en/workspace/terminal.md b/userguide/en/workspace/terminal.md new file mode 100644 index 0000000..ee277d2 --- /dev/null +++ b/userguide/en/workspace/terminal.md @@ -0,0 +1,48 @@ +## WEB Terminal ## + +Working in a terminal is necessary for many developers. Almost all +system administration, programming, debugging and analysis jobs can be +performed in a terminal environment. 
+ +In the Jupyter Notebook dashboard, clicking **New** - **Terminal** creates a new WEB Terminal and enters it. There users can do their work, such as editing a file using vi or running a program. + +The Jupyter Notebook WEB Terminal has one important feature: it keeps running in +the background even after users have closed the WEB Terminal webpage. +Users can find all their live Terminals by clicking **Running** - +**Terminals** in the dashboard. They can re-enter a Terminal by +clicking its name, to resume their work. This feature is very important +for long-running jobs. + +If the user re-enters the Terminal after a long idle period, the Terminal +may not respond to user input. Refreshing the page usually brings it +back. + +Users who need to perform multiple tasks can open several +Terminals, or use [tmux](https://tmux.github.io) in one Terminal. + +**Note**: +Sometimes, in Chrome, the terminal font may not be monospaced. +This can make the terminal width vary and leave command output misaligned. +To fix it, go to Chrome -> Settings -> Show Advanced Settings -> +Custom Fonts and choose a monospaced font for **Monospaced Font**. + +### Install Software ### + +Users can install software packages that are not in the base image. The Docklet +container is based on Ubuntu, so the command `apt-get` is used for package +installation. + +Example: + +``` +$ apt-get install clang +``` + +Users are encouraged to clean the `apt-get` cache to save disk space: + +``` +$ apt-get clean +``` + +For more about `apt-get`, please refer to +the [official help](https://help.ubuntu.com/community/AptGet/Howto). 
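The install-then-clean steps above can be wrapped in a small shell function. This is only a sketch: the function name `install_and_clean` is hypothetical, it assumes the user has root privileges in the container, and it adds an `apt-get update` step on the assumption that the package index may be stale.

```shell
# Install one or more packages, then clean the apt cache.
# The function name install_and_clean is hypothetical; root privileges are assumed.
install_and_clean() {
    if [ "$#" -eq 0 ]; then
        echo "usage: install_and_clean PACKAGE..." >&2
        return 1
    fi
    # Refresh the package index so that apt-get can find current package versions.
    apt-get update || return 1
    # -y answers confirmation prompts automatically.
    apt-get install -y "$@" || return 1
    # Remove downloaded package files from the local cache.
    apt-get clean
}
```

After defining the function in a Terminal, running `install_and_clean clang` would install `clang` and then free the cache space in one step.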
diff --git a/userguide/zh/README.md b/userguide/zh/README.md new file mode 100644 index 0000000..2d837ca --- /dev/null +++ b/userguide/zh/README.md @@ -0,0 +1,10 @@ +Docklet User Guide +=============== + +v0.2 + +February 20, 2016 + +© Software Engineering Institute, Peking University + +http://www.unias.org/trac/docklet diff --git a/userguide/zh/SUMMARY.md b/userguide/zh/SUMMARY.md new file mode 100644 index 0000000..3bc0f32 --- /dev/null +++ b/userguide/zh/SUMMARY.md @@ -0,0 +1,36 @@ +# Summary + +* [Introduction to Docklet](intro/README.md) + * [What is Docklet](intro/what.md) + * [Why use Docklet](intro/why.md) + * [How to use Docklet](intro/how.md) +* [Basic Concepts](concepts/README.md) + * [Workspace](concepts/workspace.md) + * [Virtual Cluster](concepts/vcluster.md) + * [Image](concepts/image.md) + * [Service](concepts/service.md) + * [Dataspace](concepts/nfs.md) +* [Portal](portal/README.md) + * [Dashboard](portal/dashboard.md) + * [Config](portal/config.md) + * [Status](portal/status.md) + * [Hosts](portal/hosts.md) + * [Users](portal/users.md) + * [Admin](portal/admin.md) +* [Workspace](workspace/README.md) + * [Data Management](workspace/data.md) + * [File Editing](workspace/edit.md) + * [WEB Terminal](workspace/terminal.md) + * [Interactive Programming](workspace/notebook.md) +* [Development in WEB Terminal](software/README.md) + * [Available Packages](software/list.md) + * [C/C++](software/c.md) + * [Java](software/java.md) + * [MPI](software/mpi.md) + * [Spark](software/spark.md) + * [LaTeX](software/latex.md) +* [Scientific Computing and Data Processing](notebook/README.md) + * [Python Packages](notebook/python.md) + * [R Packages](notebook/r.md) + * [Using Notebook](notebook/nb.md) + * [Gallery](notebook/gallery.md) diff --git a/userguide/zh/concepts/README.md b/userguide/zh/concepts/README.md new file mode 100644 index 0000000..fde432f --- /dev/null +++ b/userguide/zh/concepts/README.md @@ -0,0 +1,8 @@ +# Basic Concepts # + +Docklet includes the following important basic concepts: +* [Workspace](workspace.md) +* [Virtual Cluster](vcluster.md) +* [Image](image.md) +* [Service](service.md) +* [Dataspace](nfs.md) diff --git a/userguide/zh/concepts/image.md b/userguide/zh/concepts/image.md new file mode 100644 index 0000000..3bb4f2d --- /dev/null +++ b/userguide/zh/concepts/image.md @@ -0,0 +1,19 @@ 
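+## Image ## + +Docklet images are very similar to [Docker](https://github.com/docker/docker) images: +an image is a read-only template used to create containers. The two differ slightly, however. + +Because Docker focuses mainly on single-host virtualization, its images are very flexible, and users can customize +their own images almost arbitrarily. Typically, a Docker image contains only one specific application framework. +For example, a Docker image may contain a complete Ubuntu operating system environment with only +Apache, or other applications the user needs, installed. + +Because Docklet focuses on the **Workspace** and the **virtual cluster**, it places specific requirements on images +and does not allow users to upload freely customized images. Docklet administrators build, +according to the needs of their organization, one or more public base images. +These contain the software and configuration required for Workspaces and clusters to work, plus a number of programming languages and computing frameworks +such as MPI, Spark, Python and R. Because a Docklet public base image is shared by many users, +it is usually much larger than a Docker image. + +If the public base image lacks software that a user needs, the user can install and configure the software in a container, +then save the Workspace to generate a private image, which can also be published and shared. diff --git a/userguide/zh/concepts/nfs.md b/userguide/zh/concepts/nfs.md new file mode 100644 index 0000000..982afbc --- /dev/null +++ b/userguide/zh/concepts/nfs.md @@ -0,0 +1,13 @@ +## Dataspace ## + +In Docklet, each user has a persistent Dataspace +for storing their own data files. Dataspaces are isolated: each user can access only their own Dataspace. + +The Dataspace is shared among the container nodes of a virtual cluster and is located in the `~/nfs` directory of each container node. + +When a user saves a Workspace as an image, the Dataspace is excluded. + +The Dataspace is backed by a distributed file system. Apart from the Dataspace, all other data saved in a container disappears +when the container is destroyed, but data in the Dataspace is stored persistently. + +Users are advised to download and back up important data themselves, in a timely manner, through the Workspace. diff --git a/userguide/zh/concepts/service.md b/userguide/zh/concepts/service.md new file mode 100644 index 0000000..5138276 --- /dev/null +++ b/userguide/zh/concepts/service.md @@ -0,0 +1,9 @@ +## Service ## + +Many software tools in Docklet images can be run directly from scripts, +while some software frameworks run as services, +such as [Spark](http://spark.apache.org) and MPI. +When a user creates and starts a container, these frameworks are not started by default, to reduce resource consumption. +Users can configure the Docklet virtual cluster and choose which services to start. + +Docklet also supports user-defined startup services. diff --git a/userguide/zh/concepts/vcluster.md b/userguide/zh/concepts/vcluster.md new file mode 100644 index 0000000..97b838e --- /dev/null +++ b/userguide/zh/concepts/vcluster.md @@ -0,0 +1,20 @@ +## Virtual Cluster ## + +The virtual cluster (vcluster) is a fundamental concept of Docklet +that underpins the running of a Workspace. A virtual cluster consists of a number of LXC containers, +each of which can be regarded as a lightweight Linux host. Usually the containers in a cluster are homogeneous, +that is, created from the same image, which simplifies cluster management. + +The maximum number of nodes in a virtual cluster, and the CPU and memory quota of each node, depend on the user's level. +A common default configuration has only one initial node. + +The nodes of a Docklet virtual cluster are designed to be spread across different physical hosts as far as possible, +to make the best use of physical resources. These distributed nodes can communicate with each other. +All nodes of a virtual cluster share one virtual subnet, and usually the first node is the gateway of that subnet. + +In most cases, users need not care about the details of the virtual cluster and only need to focus on the Workspace. +In some scenarios, users need to configure the virtual cluster so that the Workspace runs better. 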
+For example, to let code in the workspace run in parallel, the default virtual cluster must be expanded +by adding new nodes. Docklet virtual clusters scale elastically: +container nodes can be added to or removed from a virtual cluster at any time. + diff --git a/userguide/zh/concepts/workspace.md b/userguide/zh/concepts/workspace.md new file mode 100644 index 0000000..f604b2d --- /dev/null +++ b/userguide/zh/concepts/workspace.md @@ -0,0 +1,18 @@ +## Workspace ## + +The workspace is the core concept of Docklet: a web-based interactive working environment. +Docklet workspaces are implemented with [Jupyter Notebook](https://jupyter.org). +Users can create and share documents that contain live code, equations, videos, and explanatory text, +and can carry out data cleaning and transformation, numerical simulation, probability and statistics, machine learning, data analysis, and similar work. + +Most users need not care about the details of the underlying machines, clusters, containers, or virtual clusters. +They only need to open a browser and enter the workspace to complete most of their work interactively, +including visualizing results. The web workspace currently supports more than 40 programming languages. + +The figure below is a screenshot of a [Jupyter Notebook](https://jupyter.org) workspace: + +jupyter workspace + +For advanced users, the workspace also provides a web terminal for all other, non-visual work. + + diff --git a/userguide/zh/images/config-image.png b/userguide/zh/images/config-image.png new file mode 100644 index 0000000..61499a2 Binary files /dev/null and b/userguide/zh/images/config-image.png differ diff --git a/userguide/zh/images/config-workspace.png b/userguide/zh/images/config-workspace.png new file mode 100644 index 0000000..5de53f5 Binary files /dev/null and b/userguide/zh/images/config-workspace.png differ diff --git a/userguide/zh/images/dashboard-start.png b/userguide/zh/images/dashboard-start.png new file mode 100644 index 0000000..68f8549 Binary files /dev/null and b/userguide/zh/images/dashboard-start.png differ diff --git a/userguide/zh/images/dashboard-stop.png b/userguide/zh/images/dashboard-stop.png new file mode 100644 index 0000000..0488400 Binary files /dev/null and b/userguide/zh/images/dashboard-stop.png differ diff --git a/userguide/zh/images/docklet-architecture.jpg b/userguide/zh/images/docklet-architecture.jpg new file mode 100644 index 0000000..54d6476 Binary files /dev/null and b/userguide/zh/images/docklet-architecture.jpg differ diff --git a/userguide/zh/images/hosts-summary-config.png b/userguide/zh/images/hosts-summary-config.png new file mode 100644 index 0000000..706334d Binary files /dev/null and
b/userguide/zh/images/hosts-summary-config.png differ diff --git a/userguide/zh/images/hosts-summary-perf.png b/userguide/zh/images/hosts-summary-perf.png new file mode 100644 index 0000000..b47a55e Binary files /dev/null and b/userguide/zh/images/hosts-summary-perf.png differ diff --git a/userguide/zh/images/hosts.png b/userguide/zh/images/hosts.png new file mode 100644 index 0000000..3bc6eb5 Binary files /dev/null and b/userguide/zh/images/hosts.png differ diff --git a/userguide/zh/images/jupyter-notebook-dashboard.png b/userguide/zh/images/jupyter-notebook-dashboard.png new file mode 100644 index 0000000..dee4271 Binary files /dev/null and b/userguide/zh/images/jupyter-notebook-dashboard.png differ diff --git a/userguide/zh/images/jupyterpreview.png b/userguide/zh/images/jupyterpreview.png new file mode 100644 index 0000000..e9588ca Binary files /dev/null and b/userguide/zh/images/jupyterpreview.png differ diff --git a/userguide/zh/images/sklearn1.png b/userguide/zh/images/sklearn1.png new file mode 100644 index 0000000..83b1ab0 Binary files /dev/null and b/userguide/zh/images/sklearn1.png differ diff --git a/userguide/zh/images/status-summary.png b/userguide/zh/images/status-summary.png new file mode 100644 index 0000000..2681e66 Binary files /dev/null and b/userguide/zh/images/status-summary.png differ diff --git a/userguide/zh/images/status.png b/userguide/zh/images/status.png new file mode 100644 index 0000000..2ba4f8c Binary files /dev/null and b/userguide/zh/images/status.png differ diff --git a/userguide/zh/images/user_init.png b/userguide/zh/images/user_init.png new file mode 100644 index 0000000..44139ab Binary files /dev/null and b/userguide/zh/images/user_init.png differ diff --git a/userguide/zh/images/workspace-create.png b/userguide/zh/images/workspace-create.png new file mode 100644 index 0000000..519fd2b Binary files /dev/null and b/userguide/zh/images/workspace-create.png differ diff --git a/userguide/zh/intro/README.md 
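b/userguide/zh/intro/README.md new file mode 100644 index 0000000..677192f --- /dev/null +++ b/userguide/zh/intro/README.md @@ -0,0 +1,14 @@ +# Introduction to Docklet # + +Welcome to the world of Docklet! + +What is Docklet? + +How is it related to the well-known Docker? + +What benefits does it bring? + +How do you use Docklet? + +This chapter begins by answering these questions. + diff --git a/userguide/zh/intro/how.md b/userguide/zh/intro/how.md new file mode 100644 index 0000000..138f09b --- /dev/null +++ b/userguide/zh/intro/how.md @@ -0,0 +1,11 @@ +## How to Use Docklet ## + +Docklet users need only a modern browser, such as Firefox, Chrome, or Safari, +to access and use it. + +The Docklet administrator configures how users log in. For example, the deployment at the Peking University Computer Center, +http://iwork.internetware.org , supports login with either a Peking University account or an internal account. + +Docklet also supports a Guest account, which users can use to try out the "working environment". +The Guest working environment is reset periodically. + diff --git a/userguide/zh/intro/what.md b/userguide/zh/intro/what.md new file mode 100644 index 0000000..ba2e7b3 --- /dev/null +++ b/userguide/zh/intro/what.md @@ -0,0 +1,49 @@ +## What Is Docklet ## + +### Definition and Architecture ### + +Docklet is an open-source project of the Software Engineering Institute at Peking University, released under the [New BSD License](http://directory.fsf.org/wiki/License:BSD_3Clause). Its source code is maintained +on GitHub. + +Docklet aims to provide a **cloud-based personal workspace** solution +that lets an enterprise easily virtualize its data center, create personal virtual clusters for its users, +and thereby give each user a customizable workspace in the **cloud**. +With nothing more than a browser, users can reach their own workspace inside the enterprise anytime, anywhere, +and work online: writing code, debugging and running it, managing data, analyzing data, visualizing results, and so on. + +At its core, Docklet is a container cluster built on Linux container (LXC) technology and software-defined networking. +It adds a further layer of encapsulation on top of LXC, so that users need not care about the management details of individual containers, +let alone the network communication among distributed containers. +Such a container cluster provides a way to virtualize a physical cluster +and supports installing and running the vast majority of existing standalone or distributed Linux applications, forming a personal development workspace. +Users access their workspace through the [Jupyter Notebook](https://github.com/jupyter/notebook) integrated into Docklet +and do their development and run their programs in the browser. + + +The architecture of Docklet is shown in the figure below: + + + + + +### Docklet and Docker ### + +Docklet has no direct relationship with [Docker](https://github.com/docker/docker). +They are two entirely different projects with entirely different design goals. + +Docker aims to provide a lightweight operating-system virtualization solution +and sits at the [IaaS layer](https://en.wikipedia.org/wiki/Cloud_computing) of cloud computing. +Docker adds a further layer of encapsulation on top of LXC, +so that [operating a Docker container is as simple for users as operating a fast, lightweight virtual machine +](https://www.gitbook.com/book/yeasy/docker_practice). +Although network clustering capabilities were added later, Docker is still essentially oriented toward single-host systems +and sits at the operating-system layer of the software stack. + +Docklet aims to provide a cloud-based solution for developer workspaces +that spans the [SaaS, PaaS, and IaaS layers](https://en.wikipedia.org/wiki/Cloud_computing). +Docklet is built on LXC container clusters, not on Docker.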
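+Docklet users do their software development, debugging, +and testing directly in the browser, working with the development tools that run in the virtual cluster; these tools sit at the application layer of the software stack. diff --git a/userguide/zh/intro/why.md b/userguide/zh/intro/why.md new file mode 100644 index 0000000..c28f71b --- /dev/null +++ b/userguide/zh/intro/why.md @@ -0,0 +1,79 @@ +## Why Use Docklet ## + +As a "**workspace cloud**", Docklet has many advantages. + +First, Docklet provides a developer-oriented virtual workspace rather than just IaaS or PaaS, +which makes it friendlier to users. Second, its virtual-cluster model uses physical resources very efficiently +and is easy to operate, which makes it friendly to administrators. + +Specifically, Docklet has the following advantages and features: + +### Broad Applicability ### + +Docklet is not designed for large data centers; it suits micro and small data centers, +that is, clusters of fewer than a hundred physical machines with a high-speed network and reasonably powerful nodes. +This is a common setup in most small and medium-sized enterprises and research institutions, which gives Docklet broad applicability. + +### Multiple Computing Frameworks Share Cluster Resources ### + +Traditional MPI clusters and, more recently, Spark clusters are usually dedicated, +so machines are hard to share among different computing frameworks. [Mesos](http://mesos.apache.org) and +[Yarn](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html) each provide a mechanism for sharing cluster resources, but they do so by adapting the computing frameworks, +which requires extra programming work for every framework to be supported. By virtualizing the cluster, +Docklet offers a more flexible and easier way to share resources: +computing frameworks run seamlessly in a virtual cluster without modification. + +### Dynamic, Elastic Resource Allocation ### + +Docklet virtual clusters support elastic resource allocation. By running one command or making one click, +a user can **instantly** add to the virtual cluster a container node with the same capabilities as the existing containers. +This is very useful for computing frameworks such as Spark that support elastic use of resources. + +Docklet can also monitor user activity, intelligently detect inactive users, and automatically adjust resource quotas +to achieve better resource utilization and a better user experience. + +### Low Network Resource Consumption ### + +Docklet uses software-defined networking. Internally it is a virtual IP network; +externally, only **one public IP address or domain name** needs to be configured. Without any other software, +users simply visit that public IP address or domain name in a browser to pass through the organization's firewall automatically +and reach the private virtual clusters running inside the organization. + +### Easy to Use ### + +Docklet is aimed mainly at developers within an organization. +Through a public base image, it preinstalls the common software that the organization's development work requires. +With nothing more than a browser, users can easily enter the workspace +and start programming, debugging, and running code right away, +avoiding complex and tedious tasks such as setting up a physical cluster, installing and configuring software, and managing images. +This is especially important for users who are not IT professionals. + +### Well Suited to Experiments and Testing, Such as Big Data ### + +Because Docklet is easy to use, it can be an ideal platform for experiments and testing. +For example, a user can test and debug a big data program against a relatively small (gigabyte-scale) dataset +and, once it works, deploy it to a production system. Many data analysis tasks that involve modest amounts of data +can be run directly on Docklet to obtain final results. + +Docklet is also an ideal learning platform: +its built-in language tools and computing frameworks let users practice and experiment with programming directly. + +### Easy to Customize and Extend ### + +A Docklet system administrator can build public base images tailored to the organization's needs. +Users can also install and configure the specific software they need, +then save their workspace as a private image and publish it for other users. +Docklet is therefore highly customizable and extensible. + +### Multi-User Support ### + +Docklet virtual clusters support multiple users by design. Each user has an independent private virtual cluster, +and each cluster can install and run its own MPI and Spark frameworks without interfering with the others, +so the differing needs of different users can be met. + +### Advantages Inherited from LXC ### + +Docklet shares the many inherent advantages of LXC-based virtualization solutions such as Docker: +fast startup, with containers starting within seconds; high utilization of system resources,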
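+with a single host able to run hundreds or thousands of containers at once; high application performance with low system overhead; +fast delivery; simple management; and so on. diff --git a/userguide/zh/notebook/README.md b/userguide/zh/notebook/README.md new file mode 100644 index 0000000..d5b1447 --- /dev/null +++ b/userguide/zh/notebook/README.md @@ -0,0 +1,34 @@ +# Scientific Computing and Data Processing # + +Docklet supports scientific computing and data processing mainly through its bundled [Python](https://www.python.org) and [R](http://www.r-project.org) +packages. Most of the work can be done interactively in Jupyter Notebook, +with results displayed visually in the browser. + +The internet offers many articles on using Python and R for big data processing, scientific computing, machine learning, +natural language processing, statistical analysis, data mining, graph computing, data visualization, and similar tasks. +Most of them introduce particular packages and tools, which is very valuable for beginners. + +[CONTINUUM](https://www.continuum.io)'s +[ANACONDA](https://www.continuum.io/why-anaconda) is a computing and analytics platform +that bundles many open-source Python packages, organized by category, along with some R packages. + +[ENTHOUGHT](https://www.enthought.com)'s +[Canopy](https://www.enthought.com/products/canopy/) is a system similar to ANACONDA +that bundles many open-source Python packages. + +[RStudio](https://www.rstudio.com) provides introductions to the [RStudio +IDE](https://www.rstudio.com/products/rstudio2/) and +[R packages](https://www.rstudio.com/products/rpackages/). + +Many tools related to visualization and IDEs must be installed and run on the local operating system. Most of the +Python and R packages in Docklet can be run in Jupyter Notebook. Users can also +install missing packages themselves with tools such as `pip3`. + +The Python and R packages in Docklet are listed here: + + * [Python Packages](python.md) + * [R Packages](r.md) + + +For tutorials and examples of scientific computing and similar work in Jupyter Notebook, see [here +](https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks) and [here](http://nbviewer.jupyter.org). diff --git a/userguide/zh/notebook/gallery.md b/userguide/zh/notebook/gallery.md new file mode 100644 index 0000000..70288aa --- /dev/null +++ b/userguide/zh/notebook/gallery.md @@ -0,0 +1,587 @@ +## Some Interesting Notebook Examples ## + +https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks collects examples of using the Notebook for all kinds of work. Docklet comes with the common [Python3](python.md) and +[R](r.md) packages preinstalled; if a package is missing, install it yourself with `pip3` or a similar command. + + +1.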
[Entire books or other large collections of notebooks on a topic](#entire-books-or-other-large-collections-of-notebooks-on-a-topic) + * [Introductory Tutorials](#introductory-tutorials) + * [Programming and Computer Science](#programming-and-computer-science) + * [Statistics, Machine Learning and Data Science](#statistics-machine-learning-and-data-science) + * [Mathematics, Physics, Chemistry, Biology](#mathematics-physics-chemistry-biology) + * [Earth Science and Geo-Spatial data](#earth-science-and-geo-spatial-data) + * [Linguistics and Text Mining](#linguistics-and-text-mining) + * [Signal Processing](#signal-processing) + * [Engineering Education](#engineering-education) +1. [Scientific computing and data analysis with the SciPy Stack](#scientific-computing-and-data-analysis-with-the-scipy-stack) + * [General topics in scientific computing](#general-topics-in-scientific-computing) + * [Social data](#social-data) + * [Psychology and Neuroscience](#psychology-and-neuroscience) + * [Machine Learning, Statistics and Probability](#machine-learning-statistics-and-probability) + * [Physics, Chemistry and Biology](#physics-chemistry-and-biology) + * [Economics and Finance](#economics-and-finance) + * [Earth science and geo-spatial data](#earth-science-and-geo-spatial-data) + * [Data visualization and plotting](#data-visualization-and-plotting) + * [Mathematics](#mathematics) + * [Signal and Sound Processing](#signal-and-sound-processing) + * [Natural Language Processing](#natural-language-processing) + * [Pandas for data analysis](#pandas-for-data-analysis) +1. [General Python Programming](#general-python-programming) +1. [Notebooks in languages other than Python](#notebooks-in-languages-other-than-python) + * [Julia](#julia) + * [Haskell](#haskell) + * [Ruby](#ruby) + * [Perl](#perl) +1. [Miscellaneous topics about doing various things *with* the Notebook itself](#miscellaneous-topics-about-doing-various-things-with-the-notebook-itself) +1. 
[Reproducible academic publications](#reproducible-academic-publications) +1. [Other publications using the Notebook](#other-publications-and-conference-abstracts-that-explicitly-use-the-notebook) +1. [Data-driven journalism](#data-driven-journalism) +1. [Whimsical notebooks](#whimsical-notebooks) +1. [Videos of IPython being used in the wild](#videos-of-ipython-being-used-in-the-wild) + +## Entire books or other large collections of notebooks on a topic + +### Introductory Tutorials + +* First things first, how to [run code in the notebook](https://nbviewer.jupyter.org/github/jupyter/notebook/blob/master/docs/source/examples/Notebook/Running%20Code.ipynb). There is also a general [collection of notebooks](https://nbviewer.jupyter.org/github/ipython/ipython/blob/master/examples/IPython%20Kernel/Index.ipynb) from IPython. Another useful one from this collection is an explanation of our [rich display system](https://nbviewer.jupyter.org/github/ipython/ipython/blob/master/examples/IPython%20Kernel/Custom%20Display%20Logic.ipynb). + +* A [great matplotlib tutorial](http://nbviewer.ipython.org/github/jrjohansson/scientific-python-lectures/blob/master/Lecture-4-Matplotlib.ipynb), part of the fantastic [Lectures on Scientific Computing with Python](http://nbviewer.ipython.org/github/jrjohansson/scientific-python-lectures/tree/master) by [J.R. Johansson](https://github.com/jrjohansson). + +* The code of the [IPython mini-book](https://github.com/rossant/ipython-minibook) by C. Rossant, introducing IPython, NumPy, SciPy, Pandas and matplotlib for interactive computing and data visualization. 
+ +* [Python Tutorial](https://github.com/rajathkumarmp/Python-Lectures) by [Rajath Kumar M P](https://github.com/rajathkumarmp) + +### Programming and Computer Science + +* [Introduction to Programming (using Python)](http://nbviewer.ipython.org/github/ehmatthes/intro_programming/blob/master/notebooks/index.ipynb), an entire introductory Python course written by [Eric Matthes](http://peak5390.wordpress.com/about). [This post](http://peak5390.wordpress.com/2013/09/22/how-ipython-notebook-and-github-have-changed-the-way-i-teach-python/) explains the educational context in an Alaskan high school where Eric is a teacher. + +* [Python for Developers](http://ricardoduarte.github.io/python-for-developers), a complete book on Python programming by [Ricardo Duarte](https://github.com/ricardoduarte). Note the book also exists [in Portuguese](http://ricardoduarte.github.io/python-para-desenvolvedores). + +* [CS1001.py - Extended Introduction to Computer Science](https://github.com/yoavram/CS1001.py). Recitations from Tel-Aviv University introductory course to computer science, assembled as IPython notebooks by [Yoav Ram](http://www.yoavram.com/). + +* [Exploratory Computing with Python](http://mbakker7.github.io/exploratory_computing_with_python/), a set of 15 Notebooks that cover exploratory computing, data analysis, and visualization. No prior programming knowledge required. Each Notebook includes a number of exercises (with answers) that should take less than 4 hours to complete. Developed by Mark Bakker for undergraduate engineering students at the Delft University of Technology. 
+ +* [Understanding evolutionary strategies and covariance matrix adaptation](http://nbviewer.ipython.org/github/lmarti/evolutionary-computation-course/blob/master/AEC.04%20-%20Evolutionary%20Strategies%20and%20Covariance%20Matrix%20Adaptation.ipynb), from the [Advanced Evolutionary Computation: Theory and Practice](http://nbviewer.ipython.org/github/lmarti/evolutionary-computation-course/tree/master/) course by [Luis Martí](http://lmarti.com). + +### Statistics, Machine Learning and Data Science + +* [AM207: Monte Carlo Methods, Stochastic Optimization](http://am207.org): a complete course by Verena Kaynig-Fittkau and Pavlos Protopapas from Harvard, with all lecture materials and homework sets as notebooks. + +* [An introduction to Bayesian inference](http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1_Introduction/Chapter1.ipynb), this is just chapter 1 in an ongoing book titled [Probabilistic Programming and Bayesian Methods for Hackers Using Python and PyMC](http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers), by [Cameron Davidson-Pilon](http://camdp.com/). + +* [Learn Data Science](http://learnds.com), an entire self-directed course by [Nitin Borwankar](https://github.com/nborwankar). + +* [IPython Cookbook](http://ipython-books.github.io/cookbook/) by [Cyrille Rossant](http://cyrille.rossant.net/), a comprehensive guide to Python for Data Science. The code of the 100 recipes is available on [the GitHub repository](https://github.com/ipython-books/cookbook-code). + +* [An introduction to machine learning with Python and scikit-learn](http://nbviewer.ipython.org/github/temporaer/tutorial_ml_gkbionics/blob/master/2%20-%20KMeans.ipynb) ([repo and overview](https://github.com/amueller/tutorial_ml_gkbionics)) by [Hannes Schulz](https://github.com/temporaer) and [Andreas Mueller](https://github.com/amueller). 
+ +* [Clustering and Regression](http://nbviewer.ipython.org/github/amplab/datascience-sp14/blob/master/hw2/HW2.ipynb), part of the UC Berkeley 2014 [Introduction to Data Science](http://amplab.github.io/datascience-sp14/) course taught by [Michael Franklin](http://www.cs.berkeley.edu/~franklin/). + +* [Neural Networks](http://nbviewer.ipython.org/github/masinoa/machine_learning/blob/master/04_Neural_Networks.ipynb), part of a collection on [machine learning](https://github.com/masinoa/machine_learning) by [Aaron Masino](https://github.com/masinoa). + +* [An introduction to Pandas](http://nbviewer.ipython.org/urls/bitbucket.org/hrojas/learn-pandas/raw/master/lessons/01%20-%20Lesson.ipynb), part of an [11-lesson tutorial on Pandas](https://bitbucket.org/hrojas/learn-pandas), by [Hernán Rojas](https://bitbucket.org/hrojas). + +* The [Statsmodels Project](http://statsmodels.sourceforge.net) has two excellent collections of examples: [in their official documentation](http://statsmodels.sourceforge.net/devel/examples/index.html) and [extra ones in their wiki](https://github.com/statsmodels/statsmodels/wiki/Examples#user-contributed-examples). Too many there to directly duplicate here, but they provide great learning materials on statistical modeling with Python. + +* [Machine Learning with the Shogun Toolbox](http://cloud.shogun-toolbox.org). This is a complete *service* that includes a ready-to-run IPython instance with a collection of notebooks illustrating the use of the [Shogun Toolbox](http://shogun-toolbox.org). Just log in and start running the examples. + +* [Python for Data Analysis](https://github.com/ResearchComputing/Meetup-Fall-2013), an introductory collection from the [CU Boulder Research Computing Group](http://researchcomputing.github.io/). 
+ +* [The Kaggle bulldozers competition example](http://danielfrg.github.io/blog/2013/03/07/kaggle-bulldozers-basic-cleaning), one of a set of tutorials on exploratory data analysis with the [copper toolkit](https://github.com/danielfrg/copper#copper) by [Daniel Rodríguez](http://danielfrg.github.io). + +* [Understanding model reliability](http://nbviewer.ipython.org/github/mwaskom/Psych216/blob/master/week6_tutorial.ipynb), part of a complete [course on statistics and data analysis for psychologists](https://github.com/mwaskom/Psych216) by [Michael Waskom](https://github.com/mwaskom). + +* [Graphical Representations of Linear Models](http://nbviewer.ipython.org/github/mwaskom/seaborn/blob/master/examples/linear_models.ipynb), an illustration of the [Seaborn statistical visualization library](https://github.com/mwaskom/seaborn), that also includes [Visualizing distributions of data](http://nbviewer.ipython.org/github/mwaskom/seaborn/blob/master/examples/plotting_distributions.ipynb) and [Representing variability in timeseries plots](http://nbviewer.ipython.org/github/mwaskom/seaborn/blob/master/examples/timeseries_plots.ipynb). By [Michael Waskom](https://github.com/mwaskom). + +* [Desperately Seeking Silver](http://nbviewer.ipython.org/github/cs109/content/blob/master/HW2.ipynb), one of the homework sets for Harvard's [CS 109 Data Science course](http://cs109.org). + +* [IPython Notebooks for 'An Introduction to Statistical Learning with Applications in R'](https://github.com/JWarmenhoven/ISLR-python), Python code for a selection of tables, figures and LAB sections from the book by James, Witten, Hastie, Tibshirani (2013).
+ +* [Python Notebooks for StatLearning Exercises](https://github.com/sujitpal/statlearning-notebooks), Python implementations of the R labs for the [StatLearning: Statistical Learning](https://class.stanford.edu/courses/HumanitiesScience/StatLearning/Winter2014/about) online course from Stanford University taught by Profs Trevor Hastie and Rob Tibshirani. + +* [Applied Predictive Modeling with Python](http://nbviewer.ipython.org/github/leig/Applied-Predictive-Modeling-with-Python/tree/master/notebooks/), Python implementations of the examples (originally written in R) from a famous introductory book, [Applied Predictive Modeling](http://appliedpredictivemodeling.com), by Max Kuhn and Kjell Johnson. + +* A collection of [four courses in foundations of data science, algorithms and databases](http://nbviewer.ipython.org/github/ledeprogram/courses/tree/master) from multiple faculty at [Columbia University's Lede Program](http://www.journalism.columbia.edu/page/1058-the-lede-program-an-introduction-to-data-practices/906). + +* [SciPy and OpenCV as an interactive computing environment for computer vision](http://nbviewer.ipython.org/github/thsant/scipy4cv) by [Thiago Santos](http://ttsantos.net), a tutorial presented at [SIBGRAPI 2014](http://emap.fgv.br/sibgrapi-2014/tutorials.html). + +* [Kalman and Bayesian Filters in Python](http://nbviewer.ipython.org/github/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/table_of_contents.ipynb), by [Roger Labbe](https://github.com/rlabbe). + +* [Machine learning in Python](http://nbviewer.ipython.org/github/jdwittenauer/ipython-notebooks/blob/master/ML-Exercise1.ipynb), a series based on Andrew Ng's Coursera class on machine learning. Part of a [larger collection of data science notebooks](https://github.com/jdwittenauer/ipython-notebooks) by [John Wittenauer](https://github.com/jdwittenauer). 
+ +* [An example machine learning notebook](http://nbviewer.jupyter.org/github/rhiever/Data-Analysis-and-Machine-Learning-Projects/blob/master/example-data-science-notebook/Example%20Machine%20Learning%20Notebook.ipynb), by [Randal S. Olson](http://www.randalolson.com/), part of a [collection in Data Analysis and Machine Learning](http://nbviewer.jupyter.org/github/rhiever/Data-Analysis-and-Machine-Learning-Projects). + +### Mathematics, Physics, Chemistry, Biology + +* A [single-atom laser model](http://nbviewer.ipython.org/github/jrjohansson/qutip-lectures/blob/master/Lecture-2B-Single-Atom-Lasing.ipynb). This is one of a complete set of [lectures on quantum mechanics and quantum optics using QuTiP](http://nbviewer.ipython.org/github/jrjohansson/qutip-lectures) by [J.R. Johansson](https://github.com/jrjohansson). + +* [2-d rigid-body transformations](http://nbviewer.ipython.org/github/demotu/BMC/blob/master/notebooks/Transformation2D.ipynb). This is part of [Scientific Computing in Biomechanics and Motor Control](https://github.com/demotu/BMC), a complete collection of notebooks by [Marcos Duarte](https://github.com/demotu). + +* Astrophysical simulations and analysis with [yt](http://yt-project.org): a collection of example notebooks on using various codes that yt interfaces with: [Enzo](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/ac275ee8a462425c93b36e330e243705.ipynb), [Gadget](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/e209c55b6aaa4a9ab12c55422bb3afdc.ipynb), [RAMSES](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/9fe5b4bc68ce48d5aa665edbc4ec3aa3.ipynb), [PKDGrav](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/9fe5b4bc68ce48d5aa665edbc4ec3aa3.ipynb) and [Gasoline](http://nbviewer.ipython.org/url/hub.yt-project.org/notebooks/9fe5b4bc68ce48d5aa665edbc4ec3aa3.ipynb). *Note:* the yt site currently throws an SSL warning; it seems to have an outdated or self-signed certificate.
+ +* [Working with Reactions](http://nbviewer.ipython.org/gist/4316430), part of a set of tutorials on [cheminformatics and machine learning with the rdkit project](http://code.google.com/p/rdkit/wiki/UGM2012Tutorials), by Greg Landrum. + +* [CFD Python: 12 steps to Navier-Stokes](http://lorenabarba.com/blog/cfd-python-12-steps-to-navier-stokes). A complete set of lectures on Computational Fluid Dynamics, from 1-d linear waves to full 2-d Navier-Stokes, by [Lorena Barba](http://lorenabarba.com). + +* [AeroPython: Aerodynamics-Hydrodynamics with Python](https://github.com/barbagroup/AeroPython), a complete course taught at George Washington University by [Lorena Barba](http://lorenabarba.com). + +* [Practical Numerical Methods with Python](https://github.com/numerical-mooc/numerical-mooc), a collection of learning modules (each consisting of several IPython Notebooks) for a course in numerical differential equations taught at George Washington University by [Lorena Barba](http://lorenabarba.com). Also offered as a "massive, open online course" (MOOC) on the [GW SEAS Open edX](http://openedx.seas.gwu.edu/courses/GW/MAE6286/2014_fall/about) platform. + +* [pyuvvis: tools for explorative spectroscopy](https://github.com/hugadams/pyuvvis), a spectroscopy library built for integration with IPython notebooks, matplotlib and pandas. + +* [HyperPython: a practical introduction to the solution of hyperbolic conservation laws](http://nbviewer.ipython.org/github/ketch/HyperPython/tree/master/), a course by [David Ketcheson](http://davidketcheson.info). + +* [An Introduction to Applied Bioinformatics](http://caporasolab.us/An-Introduction-To-Applied-Bioinformatics): Interactive lessons in bioinformatics, by [Greg Caporaso](http://caporasolab.us).
+ +* Colour science computations with [colour](https://github.com/colour-science/colour), a Python package implementing a comprehensive number of colour theory transformations and algorithms supported by a [dedicated collection of IPython Notebooks](http://nbviewer.ipython.org/github/colour-science/colour-ipython/blob/master/notebooks/colour.ipynb). More colour science related [IPython Notebooks](http://nbviewer.ipython.org/github/colour-science/colour-website/tree/master/ipython/) are available on [colour-science.org](http://colour-science.org/). + +* The [notebooks](https://github.com/tiagoantao/bioinf-python/blob/master/notebooks/Welcome.ipynb) from the Book [Bioinformatics with Python Cookbook](http://www.amazon.com/Bioinformatics-Python-Cookbook-Tiago-Antao/dp/1782175113), covering several fields like Next-Generation Sequencing, Population Genetics, Phylogenetics, Genomics, Proteomics and Geo-referenced information. + +### Earth Science and Geo-Spatial data + +* [EarthPy](http://earthpy.org), a collection of IPython notebooks with a focus on Earth Sciences, from [whale tracks](http://earthpy.org/analyzing-whale-tracks.html) to the [flow of the Amazon](http://earthpy.org/earthpy-basemap-amazon.html). + +* [Python for Geosciences](https://github.com/koldunovn/python_for_geosciences), a tutorial series aimed at the Earth Sciences community, by [Nikolay Koldunov](http://earthpy.org/author/nikolay-koldunov.html). + +* [Find graffiti close to NY subway entrances](http://nbviewer.ipython.org/github/invisibleroads/crosscompute-tutorials/blob/master/computational-analysis/300%20Count%20graffiti%20sightings%20within%20100%20feet%20of%20a%20subway%20entrance.ipynb), one of a rich [collection of notebooks on large-scale data analysis](https://github.com/invisibleroads/crosscompute-tutorials), by [Roy Hyunjin Han](https://github.com/invisibleroads). 
+ +* [Logistic models of well switching in Bangladesh](http://nbviewer.ipython.org/github/carljv/Will_it_Python/blob/master/ARM/ch5/arsenic_wells_switching.ipynb), part of the ["Will it Python"](http://slendrmeans.wordpress.com/will-it-python) blog series ([repo](https://github.com/carljv/Will_it_Python)) on Machine Learning and data analysis in Python. By Carl Vogel. + +* [Estimated likelihood of observing a large earthquake on a continental low‐angle normal fault and implications for low‐angle normal fault activity](http://nbviewer.ipython.org/github/cossatot/lanf_earthquake_likelihood/blob/master/notebooks/lanf_manuscript_notebook.ipynb), an executable version of a paper by Richard Styron and Eric Hetland published in *Geophysical Research Letters*, on earthquake probabilities. + +* [python4oceanographers](https://ocefpaf.github.io/python4oceanographers/), a blog demonstrating analyses in physical oceanography from [resource-demanding numerical computations with functions in compiled languages](https://ocefpaf.github.io/python4oceanographers/blog/2015/10/05/isosurfaces/) to specialized [tidal analysis](https://ocefpaf.github.io/python4oceanographers/blog/2015/02/16/utide/) to [visualization of various geo data using fancy things like interactive maps](https://ocefpaf.github.io/python4oceanographers/blog/2015/07/13/interactive_geo/). + +### Linguistics and Text Mining + +* [Workshop on text analysis](http://nbviewer.ipython.org/github/nealcaren/workshop_2014/tree/master/notebooks/) by [Neal Caren](http://nealcaren.web.unc.edu). + +* [Detecting Algorithmically Generated Domains](http://nbviewer.ipython.org/github/ClickSecurity/data_hacking/blob/master/dga_detection/DGA_Domain_Detection.ipynb), part of the [Data Hacking](http://clicksecurity.github.io/data_hacking) collection on security-oriented data analysis with IPython & friends. + +* [Mining the Social Web (2nd Edition)](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition). 
A complete collection of notebooks accompanying [Matthew Russell's book](http://shop.oreilly.com/product/0636920030195.do), published by O'Reilly. + +### Signal Processing + +* [Sound Analysis with the Fourier Transform](https://github.com/calebmadrigal/FourierTalkOSCON). A set of IPython Notebooks by [Caleb Madrigal](http://calebmadrigal.com) to explain what the Fourier Transform is and how to use it for basic audio processing applications. + +* [An introduction to Compressed Sensing](http://nbviewer.ipython.org/github/unpingco/Python-for-Signal-Processing/blob/master/Compressive_Sampling.ipynb), part of [Python for Signal Processing](http://nbviewer.ipython.org/github/unpingco/Python-for-Signal-Processing): an entire book (and [blog](http://python-for-signal-processing.blogspot.com)) on the subject by Jose Unpingco. + +* [Kalman and Bayesian Filters in Python](http://nbviewer.ipython.org/github/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/table_of_contents.ipynb). A textbook and accompanying filtering library on the topic of Kalman filtering and other related Bayesian filtering techniques. + +* [Classify human movements using Dynamic Time Warping & K Nearest Neighbors:](http://nbviewer.ipython.org/github/markdregan/K-Nearest-Neighbors-with-Dynamic-Time-Warping/blob/master/K_Nearest_Neighbor_Dynamic_Time_Warping.ipynb) Signals from a smartphone gyroscope and accelerometer are used to classify whether the person is running, walking, sitting, standing, etc. This IPython notebook contains a Python implementation of the DTW and KNN algorithms along with explanations and a practical application. + +* [Digital Signal Processing](https://github.com/spatialaudio/digital-signal-processing-lecture). A collection of notebooks that accompanies a master's course on the topic. + +### Engineering Education + +* [Introduction to Chemical Engineering Analysis](http://jckantor.github.io/CBE20255/) by [Jeff Kantor](http://jckantor.github.io/).
A collection of IPython notebooks illustrating topics in introductory chemical engineering analysis, including stoichiometry, generation-consumption analysis, and mass and energy balances. + +## Scientific computing and data analysis with the SciPy Stack +### General topics in scientific computing + +* [Algorithms in IPython notebooks](https://github.com/rasbt/algorithms_in_ipython_notebooks), by [Sebastian Raschka](https://github.com/rasbt). + +* [Comparing the performance of Python compilers - Cython vs. Numba vs. Parakeet](_nbs/day4_2_cython_numba_parakeet.ipynb?create=1), by [Sebastian Raschka](https://github.com/rasbt). + +* [A Crash Course in Python for Scientists](http://nbviewer.ipython.org/gist/5920182), by Sandia's [Rick Muller](http://www.cs.sandia.gov/~rmuller). + +* [A gentle introduction to scientific programming in Python, biased towards biologists](http://nbviewer.ipython.org/url/atwallab.cshl.edu/teaching/QBbootcamp3.ipynb), by [Mickey Atwal, Cold Spring Harbor Laboratory](http://atwallab.cshl.edu). + +* [Python for Data Science](http://nbviewer.ipython.org/github/gumption/Python_for_Data_Science/blob/master/Python_for_Data_Science_all.ipynb), a self-contained mini-course with exercises, by +[Joe McCarthy](http://interrelativity.com/joe). + +* [First few lectures of the UW/Coursera course on Data Analysis](http://nbviewer.ipython.org/github/fonnesbeck/ComputationalMethodsCourse/blob/master/Lecture%201.ipynb) ([repo](https://github.com/fonnesbeck/ComputationalMethodsCourse)), by Chris Fonnesbeck. + +* [CythonGSL: a Cython interface for the GNU Scientific Library (GSL)](http://nbviewer.ipython.org/github/twiecki/CythonGSL/blob/master/examples/cython_gsl_ipythonnb.ipynb) ([project repo](https://github.com/twiecki/CythonGSL)), by Thomas Wiecki. + +* [Using Numba to speed up numerical codes](http://nbviewer.ipython.org/gist/3914904). And another Numba example: [self-organizing maps](http://nbviewer.ipython.org/gist/3407544).
+ +* [Numpy performance tricks](http://nbviewer.ipython.org/gist/4645217), and [blog post](http://cyrille.rossant.net/numpy-performance-tricks/), by [Cyrille Rossant](http://cyrille.rossant.net/). + +* [IPython Parallel Push/Execute/Pull Demo](http://nbviewer.ipython.org/gist/3866987) by Justin Riley. + +* [Understanding the design of the R "formula" objects](http://nbviewer.ipython.org/url/perrin.dynevor.org/exploring_r_formula.ipynb) by Matthew Brett. + +* [Comparing different approaches to evolutionary simulations](http://nbviewer.ipython.org/url/www.slideviper.oquanta.info/test/simulation%20frameworks.ipynb). Also available [here](http://slideviewer.herokuapp.com/urls/github.com/yoavram/ipython-notebooks/raw/master/simulation%20frameworks.ipynb) for better visualization. The notebook was converted to an HTML presentation using an old nbconvert with the first development version of the `reveal` converter. By Yoav Ram. + +* [The Traveling Salesperson Problem](http://nbviewer.ipython.org/url/norvig.com/ipython/TSPv3.ipynb) by [Peter Norvig](http://norvig.com). + +* [A git tutorial targeted at scientists](http://nbviewer.ipython.org/github/fperez/reprosw/blob/master/Version%20Control.ipynb) by Fernando Perez. + +* [Running MATLAB in an IPython Notebook](http://nbviewer.ipython.org/github/plotly/IPython-plotly/blob/master/See%20more/MATLAB_Wrapper_Examples.ipynb), using [pymatbridge](https://pypi.python.org/pypi/pymatbridge). + +* [Interactive Curve-Fitting](http://nbviewer.ipython.org/gist/danielballan/1c2db3d4f2f7780cf52f) The `lmfit` package provides a widget-based interface to the curve-fitting algorithms in SciPy. 
+ +* [A visual guide to the Python Spark API for distributed computing](http://nbviewer.ipython.org/github/jkthompson/pyspark-pictures/blob/master/pyspark-pictures.ipynb) by Jeff Thompson + +### Social data + +* [Survival Analysis](http://nbviewer.ipython.org/github/CamDavidsonPilon/lifelines/blob/master/docs/Survival%20Analysis%20intro.ipynb), an illustration of the [lifelines library](https://github.com/CamDavidsonPilon/lifelines), by [Cam Davidson Pilon](https://github.com/CamDavidsonPilon). + +* A reconstruction of [Nate Silver's 538 model for the 2012 US Presidential Election](http://nbviewer.ipython.org/github/jseabold/538model/blob/master/silver_model.ipynb), by Skipper Seabold (complete [repo](https://github.com/jseabold/538model)). + +* [Data about the Sandy Hook massacre in Newtown, Connecticut](http://nbviewer.ipython.org/url/www.brianckeegan.com/data/Shootings/Shootings.ipynb), which accompanies a more detailed [blog post on the subject](http://www.brianckeegan.com/2012/12/sandy-hook-school-massacre/). Here are the [notebook and accompanying data](http://www.brianckeegan.com/data/Shootings/Shootings.ipynb). By Brian Keegan. + +* More on [gun violence analysis with Wikipedia data](http://nbviewer.ipython.org/gist/4358066). + +* [An analysis of the Gaza-Israel 2012 crisis](http://nbviewer.ipython.org/gist/4121857). + +* [Ranking NFL Teams](http://nbviewer.ipython.org/github/seanjtaylor/NFLRanking/blob/master/NFL%20Rankings.ipynb). The [full repo](https://github.com/seanjtaylor/NFLRanking) also includes an explanatory slideshow. By Sean Taylor. + +* [Automated processing of news media and generation of associated imagery](http://nbviewer.ipython.org/url/mhermans.net/files/tmp/demo_rdf_HLN.ipynb). + +* [An analysis of national school standardized test data in Colombia using Pandas](http://nbviewer.ipython.org/url/finiterank.com/saber/saber.ipynb) (in Spanish). By [Javier Moreno](http://finiterank.com). 
+ +* [Getting started with GDELT](http://nbviewer.ipython.org/github/dmasad/GDELT_Intro/blob/master/Getting_Started_with_GDELT.ipynb), by [David Masad](https://github.com/dmasad). [GDELT](http://eventdata.psu.edu/data.dir/GDELT.html) is a dataset containing more than 200-million geolocated events with global coverage for 1979 to the present. Another GDELT example from David, that nicely [integrates mapping visualizations](http://nbviewer.ipython.org/github/dmasad/GDELT_Intro/blob/master/GDELT_Mapping.ipynb). + +* [Titanic passengers, coal mining disasters, and vessel speed changes](http://nbviewer.ipython.org/gist/fonnesbeck/8495259), by [Christopher Fonnesbeck](http://biostat.mc.vanderbilt.edu/wiki/Main/ChrisFonnesbeck) + +* A geographic analysis of [Indonesian conflicts in 2012](http://nbviewer.ipython.org/github/herrfz/gdelt/blob/master/indn_connection.ipynb) with GDELT, by [herrfz](https://github.com/herrfz). + +* [Bioinformatic Approaches to the Computation of Poetic Meter](http://nbviewer.ipython.org/github/asp49/meter/blob/graph/Shared%20Horizons%20Presentation.ipynb), by [A. Sean Pue](https://github.com/asp49), [C. Titus Brown](https://github.com/ctb) and [Tracy Teal](https://github.com/tracyteal). + +* Analyzing the [Vélib dataset from Paris](http://nbviewer.ipython.org/gist/5520933), by [Cyrille Rossant](https://github.com/rossant) (Vélib is Paris' [bicycle-sharing program](http://en.wikipedia.org/wiki/V%C3%A9lib')). + +* [Using Python to see how the Times writes about men and women](http://nbviewer.ipython.org/gist/5105037), by [Neal Caren](https://twitter.com/HaphazardSoc). + +* [Exploring graph properties of the Twitter stream with twython and NetworkX](http://nbviewer.ipython.org/gist/5681541/TwitterGraphs.ipynb), by [F. Perez](http://fperez.org) (complete [gist repo with utilities here](https://gist.github.com/fperez/5681541).) 
+ +* [Kaggle Competition: Titanic Machine Learning from Disaster](http://nbviewer.ipython.org/github/agconti/kaggle-titanic/blob/master/Titanic.ipynb). By [Andrew Conti](https://github.com/agconti). + +* [How clean are San Francisco's restaurants?](http://nbviewer.ipython.org/github/Jay-Oh-eN/happy-healthy-hungry/blob/master/h3.ipynb), a data science tutorial that accompanies a [blog post](http://blog.zipfianacademy.com/post/57158627293/how-to-data-science-mapping-sf-restaurant-inspection) from [Zipfian Academy](http://zipfianacademy.com/). + +* [NYT gender wage gap and US crime by state](http://nbviewer.ipython.org/github/plotly/IPython-plotly/blob/master/Bubble%20Chart%20Explorer.ipynb). + +* [Predicting usage of the subway system in NYC](http://nbviewer.ipython.org/url/www.asimihsan.com/articles/Intro%20to%20Data%20Science%20-%20Final%20Project.ipynb), a [final project](http://blog.udacity.com/2014/05/intro-to-data-science-tools-to-ask.html) for the Udacity Intro to Data Science Course, by [Asim Ihsan](http://www.asimihsan.com). + +* [An exploratory statistical analysis of the 2014 World Cup Final](http://nbviewer.ipython.org/github/rjtavares/football-crunching/blob/master/notebooks/an%20exploratory%20data%20analysis%20of%20the%20world%20cup%20final.ipynb), by [Ricardo Tavares](https://github.com/rjtavares). Part of a [notebook collection on football (aka soccer) analysis](http://nbviewer.ipython.org/github/rjtavares/football-crunching/tree/master/). + +* [San Francisco's Drug Geography](http://nbviewer.ipython.org/github/lmart999/GIS/blob/master/SF_GIS_Crime.ipynb), a GIS analysis of public crime data in SF, by [Lance Martin](https://github.com/lmart999). + +### Psychology and Neuroscience +* [Cue Combination with Neural Populations](http://nbviewer.ipython.org/github/wtadler/cue-combination-with-neurons/blob/master/neural_cue_combination.ipynb) by [Will Adler](http://www.wtadler.com). 
Intuition and simulation for the theory (Ma *et al.*, 2006) that through probabilistic population codes, neurons can perform optimal cue combination with simple linear operations. Demonstrates that variance in cortical activity, rather than impairing sensory systems, is an adaptive mechanism to encode uncertainty in sensory measurements. + +* [Modeling psychophysical data with non-linear functions](http://nbviewer.ipython.org/github/arokem/teach_optimization/blob/master/optimization.ipynb) by Ariel Rokem. + +* [Visualizing mathematical models of brain cell connections](http://nbviewer.ipython.org/github/jonasnick/ReceptiveFields/blob/master/receptiveFields.ipynb). The effect of convolution of different receptive field functions and natural images is examined. + +* [Python for Vision Research](http://nbviewer.ipython.org/github/gestaltrevision/python_for_visres/blob/master/index.ipynb). A three-day crash course for vision researchers in programming with Python, building experiments with [PsychoPy](http://psychopy.org/) and [psychopy_ext](http://psychopy_ext.klab.lt/), learning fMRI multi-voxel pattern analysis with [PyMVPA](http://www.pymvpa.org/), and understanding image processing in Python. + +* [Loading and visualizing fMRI data](http://nbviewer.ipython.org/github/GaelVaroquaux/nilearn_course/blob/master/rendered_notebooks/1_Introduction.ipynb), part of the [Functional connectivity with NiLearn course](https://github.com/GaelVaroquaux/nilearn_course) by [Gaël Varoquaux](http://gael-varoquaux.info). + +### Machine Learning, Statistics and Probability + +* [An introduction to parallel machine learning with sklearn, joblib and IPython.parallel](http://nbviewer.ipython.org/gist/5115540/Model%20Selection%20for%20the%20Nystroem%20Method.ipynb), a notebook that accompanies this [slide deck](https://speakerdeck.com/ogrisel/parallel-and-large-scale-machine-learning-with-scikit-learn) by [Olivier Grisel](https://github.com/ogrisel). 
+ +* [A tutorial introduction to machine learning with sklearn](http://amueller.github.com/sklearn_tutorial), an IPython-based slide deck by [Andreas Mueller](https://github.com/amueller). + +* [Introduction to Machine Learning in Python with scikit-learn](http://ipython-books.github.io/featured-04/) by [Cyrille Rossant](http://cyrille.rossant.net/), a free recipe from the [IPython Cookbook](http://ipython-books.github.io/cookbook/), a comprehensive guide to Python for Data Science. + +* [An introduction to Predictive Modeling in Python](http://nbviewer.ipython.org/github/ogrisel/parallel_ml_tutorial/blob/master/solutions/01%20-%20An%20Introduction%20to%20Predictive%20Modeling%20in%20Python.ipynb), by [Olivier Grisel](https://github.com/ogrisel). + +* [Face Recognition on a subset of the Labeled Faces in the Wild dataset](http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/Labeled%20Faces%20in%20the%20Wild%20recognition.ipynb), by [Olivier Grisel](http://github.com/ogrisel). + +* [An Introduction to Bayesian Methods for Multilevel Modeling](http://nbviewer.ipython.org/github/fonnesbeck/multilevel_modeling/blob/master/multilevel_modeling.ipynb), by [Chris Fonnesbeck](http://github.com/fonnesbeck). + +* [Introduction to Bayesian Networks](http://nbviewer.ipython.org/github/kuitang/hackny-bayesnet/blob/master/hackNY%20Bayesian%20Network%20Demo.ipynb) by [Kui Tang](https://github.com/kuitang) + +* [Bayesian data analysis with PyMC3](http://nbviewer.ipython.org/github/twiecki/pymc3_talk/blob/master/bayesian_pymc3.ipynb) by [Thomas Wiecki](https://github.com/twiecki). + +* [A collection of examples for solving pattern classification problems](https://github.com/rasbt/pattern_classification), by [Sebastian Raschka](https://github.com/rasbt). 
+ +* [Introduction to Linear Regression using Python](http://nbviewer.ipython.org/github/justmarkham/DAT4/blob/master/notebooks/08_linear_regression.ipynb) by [Kevin Markham](https://github.com/justmarkham) + +* [Machine learning in Python](http://nbviewer.ipython.org/github/jdwittenauer/ipython-notebooks/blob/master/ML-Exercise1.ipynb), a series based on Andrew Ng's Coursera class on machine learning. Part of a [larger collection of data science notebooks](https://github.com/jdwittenauer/ipython-notebooks) by [John Wittenauer](https://github.com/jdwittenauer). + +* [Probability, Paradox, and the Reasonable Person Principle](http://nbviewer.ipython.org/url/norvig.com/ipython/Probability.ipynb), by Peter Norvig. + +### Physics, Chemistry and Biology + +* [Multibody dynamics and control with Python](http://www.moorepants.info/blog/npendulum.html) and the [notebook file](http://nbviewer.ipython.org/github/gilbertgede/idetc-2013-paper/blob/master/n-pendulum-control.ipynb) by Jason K. Moore. + +* [Manipulation and display of chemical structures](http://nbviewer.ipython.org/gist/4316433), by [Greg Landrum](https://github.com/greglandrum), using rdkit. + +* [The sound of Hydrogen](http://nbviewer.ipython.org/github/Carreau/posts/blob/master/07-the-sound-of-hydrogen.ipynb), visualizing and listening to the quantum-mechanical spectrum of Hydrogen. By [Matthias Bussonnier](http://github.com/Carreau). + +* Particle physics at the Large Hadron Collider (LHC): using [ROOT in an LHCb masterclass](http://root.cern.ch/drupal/content/ipython-notebooks-and-root-0): [Notebook 1](http://nbviewer.ipython.org/github/mazurov/webfest2013/blob/master/notebooks/MasterClassD0-ex1.ipynb) and [Notebook 2](http://nbviewer.ipython.org/github/mazurov/webfest2013/blob/master/notebooks/MasterClassD0-ex2%2Cex3.ipynb) notebooks by Alexander Mazurov and Andrey Ustyuzhanin at CERN. 
+ +* [A Reaction-Diffusion Equation Solver in Python with Numpy](http://nbviewer.ipython.org/github/waltherg/notebooks/blob/master/2013-12-03-Crank_Nicolson.ipynb), a demonstration of how IPython notebooks can be used to discuss both the theory and implementation of numerical algorithms on one page, by [Georg Walther](https://github.com/waltherg). + +* [Comparing different approaches to evolutionary simulations](http://nbviewer.ipython.org/url/www.slideviper.oquanta.info/test/simulation%20frameworks.ipynb). Also available [here](http://slideviewer.herokuapp.com/urls/github.com/yoavram/ipython-notebooks/raw/master/simulation%20frameworks.ipynb) for better visualization. The notebook was converted to an HTML presentation using an old nbconvert with the first development version of the `reveal` converter. By Yoav Ram. + +### Economics and Finance + +* [Replication of the highly-contentious analysis of economic growth by Reinhart and Rogoff](http://nbviewer.ipython.org/github/vincentarelbundock/Reinhart-Rogoff/blob/master/reinhart-rogoff.ipynb), by [Vincent Arel-Bundock](http://umich.edu/~varel), [full repo here](https://github.com/vincentarelbundock/Reinhart-Rogoff). This is based on the [widely-publicized critique of the original analysis done by Herndon, Ash, and Pollin](http://www.peri.umass.edu/236/hash/31e2ff374b6377b2ddec04deaa6388b1/publication/566). + +* [fecon235 for Financial Economics](https://github.com/rsvp/fecon235) series of notebooks which examines time-series data for economics and finance. Easy API to freely access data from the Federal Reserve, SEC, CFTC, stock and futures exchanges. Thus research from older notebooks can be replicated, and updated using the most current data. For example, this notebook forecasts likely Fed policy for setting the [Fed Funds rate](https://git.io/fedfunds), but market sentiment across major asset classes is observable from the [CFTC Commitment of Traders Report](https://git.io/cotr). 
Major economics indicators are renormalized: for example, various measures of [inflation](https://git.io/infl), optionally with the forward-looking break-even rates derived from U.S. Treasury bonds. Other notebooks examine international markets: especially, gold and foreign exchange. + +### Earth science and geo-spatial data +* [Exploring seafloor habitats: geographic analysis using IPython Notebook with GRASS & R](http://nbviewer.ipython.org/gist/7598354). This embeds a slideshow and a Web Spinning Globe (Cesium) in the notebook. By Massimo Di Stefano. + +* [Geo-Spatial Data with IPython](http://nbviewer.ipython.org/github/mqlaql/geospatial-data/blob/master/Geospatial-Data-with-Python.ipynb). Tutorial by [Kelsey Jordahl](http://kjordahl.net/) from SciPy2013. + +### Data visualization and plotting + +* A [Notebook](http://nbviewer.ipython.org/github/plotly/python-user-guide/blob/master/s3_bubble-charts/s3_bubble-charts.ipynb) with an interactive Hans Rosling Gapminder bubble chart from [Plotly](https://plot.ly/api/Python). + +* [Data and visualization integration via web based resources](http://tw.rpi.edu/media/2013/09/25/a48/The_Perfect_Storm_1991.html). Using NetCDF, Matplotlib, IPython Parallel and ffmpeg to generate video animation from time series of gridded data. By Massimo Di Stefano. + +* [21 Interactive, D3 Plots from matplotlib, ggplot for Python, +prettyplotlib, Stack Overflow, and seaborn](http://nbviewer.ipython.org/gist/msund/7ac1203ded66fe8134cc). + +* [Visualizing complex-valued functions with Matplotlib and Mayavi](http://nbviewer.ipython.org/github/empet/Math/blob/master/DomainColoring.ipynb), by [Emilia Petrisor](https://github.com/empet). + +* [A D3 Viewer for Matplotlib Visualizations](http://jakevdp.github.io/blog/2013/12/19/a-d3-viewer-for-matplotlib/), different from above by not depending on Plot.ly account. 
+ +* [Bokeh is an interactive web visualization library for Python](http://nbviewer.ipython.org/github/damianavila/bokeh_overview/blob/master/Bokeh%20Overview.ipynb?create=1) (and other languages). It provides d3-like novel graphics, over large datasets, all without requiring any knowledge of Javascript. It also has a Matplotlib compatibility layer. + +* [Winner of the 2014 E. Tufte Slope Graphs contest](http://nbviewer.ipython.org/gist/pascal-schetelat/8382651), by [Pascal Schetelat](https://github.com/pascal-schetelat). The original [contest info on Tufte's site](http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003nk). + +* [matta, d3.js-based visualizations in the IPython Notebook](http://nbviewer.ipython.org/github/carnby/matta/blob/master/examples/Basic%20Examples.ipynb), by [Eduardo Graells-Garrido](http://carnby.github.io/). + +### Mathematics + +* [Linear algebra with Cython](http://nbviewer.ipython.org/github/carljv/cython_testing/blob/master/cython_linalg.ipynb). A tutorial that styles the notebook differently to show that you can produce high-quality typography online with the Notebook. By Carl Vogel. + +* [Exploring how smooth-looking functions can have very surprising derivatives even at low orders](http://nbviewer.ipython.org/url/finiterank.com/cuadernos/suavesylocas.ipynb), combining SymPy and matplotlib. By [Javier Moreno](http://finiterank.com). + +* [A Collection of Applied Mathematics and Machine Learning Tutorials](http://sayilarvekuramlar.blogspot.co.uk/2015/12/matematik-ders-notlari.html) (in Turkish). By Burak Bayramli. + +* [Function minimization with iminuit](http://nbviewer.ipython.org/github/iminuit/iminuit/blob/master/tutorial/tutorial.ipynb), an introductory companion to their [hard core tutorial](http://nbviewer.ipython.org/github/iminuit/iminuit/blob/master/tutorial/hard-core-tutorial.ipynb). By the [iminuit project](http://iminuit.github.io/iminuit). 
+ +* [The Discrete Cosine Transform](http://nbviewer.ipython.org/url/cs.marlboro.edu/courses/spring2014/information/code/dct/dct.ipynb), a brief explanation and illustration of the math behind the DCT and its role in the JPEG image format, by [Jim Mahoney](http://cs.marlboro.edu). + +* [Chebfun in Python](http://nbviewer.ipython.org/gist/6724986), a demo of [PyChebfun](https://github.com/cswiercz/pychebfun), by [Olivier Verdier](http://www.olivierverdier.com). PyChebfun is a pure-python implementation of the celebrated [Chebfun package by Battles and Trefethen](http://people.maths.ox.ac.uk/trefethen/publication/PDF/2004_107.pdf). + +* [The Matrix Exponential] +(http://nbviewer.ipython.org/github/sdrelton/matrix_function_notebooks/blob/master/TheMatrixExponential.ipynb), an introduction to the matrix exponential, its applications, and a list of available software in Python and MATLAB. By [Sam Relton](http://www.maths.manchester.ac.uk/~srelton/). + +* [Fractals, complex numbers, and your imagination](http://nbviewer.ipython.org/github/cfangmeier/ipython_notebooks/blob/master/Imagination.ipynb), by [Caleb Fangmeier](https://github.com/cfangmeier). + +* [A SymPy tutorial](http://nbviewer.ipython.org/url/www.inp.nsk.su/~grozin/python/sympy.ipynb), by [Andrey Grozin](http://www.inp.nsk.su/~grozin/). + +### Signal and Sound Processing + +* [Simulation of Delta Sigma modulators in Python](http://nbviewer.ipython.org/github/ggventurini/python-deltasigma/blob/master/examples/dsdemo1.ipynb) with [deltasigma](https://github.com/ggventurini/python-deltasigma), a Python port of Richard Schreier's *excellent* [MATLAB Delta Sigma Toolbox](http://www.mathworks.com/matlabcentral/fileexchange/19-delta-sigma-toolbox), by [Giuseppe Venturini](https://github.com/ggventurini). Several demonstrative notebooks on the package [README](https://github.com/ggventurini/python-deltasigma/blob/master/README.md). 
+ +* [PyOracle: Automatic analysis of musical structure](http://nbviewer.ipython.org/urls/bitbucket.org/pucktronix/pyoracle/raw/d046b2bcf473503fa356094cfe4cff774d1aaefc/270D.ipynb), by [Greg Surges](http://gregsurges.com). + +* [A Gallery of SciPy's Window Functions for quick visual inspection and comparison] (http://nbviewer.ipython.org/urls/gist.githubusercontent.com/jaidevd/b7d865f7f4b237ab5181/raw/30bc8f998bf8f924b56b32ce10acce125656ed7c/scipy_window_gallery.ipynb) + by [Jaidev Deshpande](http://twitter.com/jaidevd) + +### Natural Language Processing + +* [Python Programming for the Humanities](http://fbkarsdorp.github.io/python-course/) by Folgert Karsdorp & Maarten van Gompel. + + + + +### Pandas for data analysis + +Note that in the 'collections' section above there are also pandas-related links, such as the one for an [11-lesson tutorial](https://bitbucket.org/hrojas/learn-pandas). + +* [A 10-minute whirlwind tour of pandas](http://nbviewer.ipython.org/gist/wesm/4757075/PandasTour.ipynb), this is the notebook accompanying a [video presentation](http://vimeo.com/59324550) by Wes McKinney, author of Pandas and the [Python for Data Analysis](http://www.amazon.com/Python-Data-Analysis-Wes-McKinney/dp/1449319793) book. + +* [Time-series analysis with Pandas](http://nbviewer.ipython.org/github/changhiskhan/talks/blob/master/pydata2012/pandas_timeseries.ipynb). + +* [Financial data analysis with Pandas](http://nbviewer.ipython.org/gist/3962843). + +* [Clustering of smartphone sensor data for human activity detection using pandas and scipy](http://nbviewer.ipython.org/github/herrfz/dataanalysis/blob/master/week4/clustering_example.ipynb), part of Coursera data analysis course, done in Python ([repo](https://github.com/herrfz/dataanalysis)). 
+ +* [Log analysis with Pandas](http://nbviewer.ipython.org/url/taaviburns.ca/presentations/log_analysis_with_pandas/nb/5-Scatterplots.ipynb), part of a [group presented at PyConCa 2012](http://taaviburns.ca/presentations/log_analysis_with_pandas/) by Taavi Burns. + +* [Analyzing and visualizing sun spot data with Pandas](http://nbviewer.ipython.org/gist/4569783), by [Josh Hemann](https://github.com/jhemann). An enlightening discussion of how naive plotting choices subtly influence our interpretation of data. + +* [Advanced analysis of Apache logs](http://nbviewer.ipython.org/github/koldunovn/nk_public_notebooks/blob/master/Apache_log.ipynb), by [Nikolay Koldunov](https://github.com/koldunovn). + +* [Statistical Data Analysis in Python] (https://github.com/fonnesbeck/statistical-analysis-python-tutorial), by [Christopher Fonnesbeck](https://github.com/fonnesbeck/), SciPy 2013. Companion videos [1](https://www.youtube.com/watch?v=DXPwSiRTxYY), [2](https://www.youtube.com/watch?v=TGEBpzJUxdI), [3](https://www.youtube.com/watch?v=YZDtBEEZuAk), [4](https://www.youtube.com/watch?v=5_rcdhBXD-0) + +## General Python Programming + +* [Learning to code with Python](http://nbviewer.ipython.org/urls/bitbucket.org/amjoconn/watpy-learning-to-code-with-python/raw/3441274a54c7ff6ff3e37285aafcbbd8cb4774f0/notebook/Learn%20to%20Code%20with%20Python.ipynb), part of an [introduction to Python](https://bitbucket.org/amjoconn/watpy-learning-to-code-with-python/src) from the [Waterloo Python users group](http://watpy.ca/blog/post/learn-code-python-review-feb-2013). + +* [Python Descriptors Demystified](http://nbviewer.ipython.org/gist/ChrisBeaumont/5758381/descriptor_writeup.ipynb), an in-depth discussion of the descriptor protocol in Python, by [Chris Beaumont](http://chrisbeaumont.org). 
+ +* [A collection of not so obvious Python stuff you should know!](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/not_so_obvious_python_stuff.ipynb?create=1), by [Sebastian Raschka](https://github.com/rasbt). + +* [Key differences between Python 2.7.x and Python 3.x](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/key_differences_between_python_2_and_3.ipynb), by [Sebastian Raschka](https://github.com/rasbt). + +* [A beginner's guide to Python's namespaces, scope resolution, and the LEGB rule](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb?create=1), by [Sebastian Raschka](https://github.com/rasbt). + +* [Sorting CSV files using the Python csv module](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/sorting_csvs.ipynb), by [Sebastian Raschka](https://github.com/rasbt). + +* Python 3 OOP series by [Leonardo Giordani](https://github.com/lgiordani): [Part 1: Objects and types](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_1__Objects_and_types.ipynb), [Part 2: Classes and members](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_2__Classes_and_members.ipynb), [Part 3: Delegation - composition and inheritance](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_3__Delegation__composition_and_inheritance.ipynb), [Part 4: Polymorphism](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_4__Polymorphism.ipynb), [Part 5: Metaclasses](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_5__Metaclasses.ipynb), [Part 6: Abstract Base 
Classes](http://nbviewer.ipython.org/github/lgiordani/blog_source/blob/master/pelican/content/notebooks/Python_3_OOP_Part_6__Abstract_Base_Classes.ipynb) + +## Notebooks in languages other than Python + +These are notebooks that use [one of the IPython kernels for other languages](IPython kernels for other languages): + +### Julia + +The IPython protocols to communicate between kernels and clients are language agnostic, and other programming language communities have started to build support for this protocol in their language. The Julia team has created [IJulia](https://github.com/JuliaLang/IJulia.jl), and these are some Julia notebooks: + +* [Fractals 3 ways](http://nbviewer.ipython.org/url/beowulf.csail.mit.edu/18.337/fractals.ipynb), by [Jeff Bezanson](https://github.com/JeffBezanson). + +* [The Design Impact of Multiple Dispatch](http://nbviewer.ipython.org/gist/StefanKarpinski/b8fe9dbb36c1427b9f22), a detailed explanation of Julia's multiple dispatch design, by [Stefan Karpinski](https://github.com/StefanKarpinski). + +* A [tutorial](http://nbviewer.ipython.org/gist/7551139) on making interactive graphs with [Plotly](https://plot.ly/) and Julia. + +* [Numerical tours in Julia](http://www.numerical-tours.com/julia/) + +* [Functional Geometry](http://nbviewer.ipython.org/github/shashi/ijulia-notebooks/blob/master/funcgeo/Functional%20Geometry.ipynb) by [Shashi Gowda](https://github.com/shashi) + +* [JuliaOpt notebooks](http://nbviewer.ipython.org/github/JuliaOpt/juliaopt-notebooks/tree/master/notebooks/), a collection of optimization-related notebooks. + +* Coursework using IJulia notebooks: + - [Métodos Numéricos Avanzados (2015-2)](https://github.com/dpsanders/MetodosNumericosAvanzados), Luis Benet and David P. Sanders + - [Métodos Monte Carlo](https://github.com/dpsanders/metodos-monte-carlo), David Sanders + - [Linear Partial Differential Equations: Analysis and Numerics](http://math.mit.edu/~stevenj/18.303/), Steven G. 
Johnson + - [Julia tutorial for Computational Molecular Biology](http://cs.brown.edu/courses/csci1810/Julia_Tutorial.html), Younhun Kim and Matthew Reyna + +* Other collections of IJulia notebooks: + - [Jiahao Chen](http://jiahao.github.io/code/) + - [Christoph Ortner](https://homepages.warwick.ac.uk/staff/C.Ortner/index.php?page=julia) + - [Crossing Language Barriers with Julia, Scipy, and IPython](https://github.com/stevengj/Julia-EuroSciPy14), presented at EuroSciPy '14 by Steven G. Johnson. + +### Haskell +There exists a Haskell kernel for IPython in the [IHaskell project](https://github.com/gibiansky/ihaskell). + +* [IHaskell Demo Notebook](http://nbviewer.ipython.org/github/gibiansky/IHaskell/blob/master/notebooks/IHaskell.ipynb) +* [Homophone reduction](http://nbviewer.ipython.org/github/gibiansky/IHaskell/blob/master/notebooks/Homophones.ipynb), a solution to a cute problem involving treating English letters as generators of a large group. +* [Gradient descent typeclass](http://nbviewer.ipython.org/github/gibiansky/IHaskell/blob/master/notebooks/Gradient-Descent.ipynb), a look at how arbitrary gradient descent algorithms can be represented with a typeclass. + +### OCaml + +[iocaml](https://github.com/andrewray/iocaml) is an OCaml kernel for IPython + +* [H.261 Video Decoding in OCaml](https://andrewray.github.io/iocamljs/oh261.html) +* [OCaml implementation of the 2048 game](http://gazagnaire.org/fuconf14/) + +### Ruby + +Similar to the Julia kernel there exists also a [Ruby kernel](https://github.com/SciRuby/iruby) for IPython. 
+ +* [IRuby Demo Notebook](http://nbviewer.ipython.org/github/SciRuby/sciruby-notebooks/blob/master/getting_started.ipynb) +* [SciRuby Notebooks](https://github.com/SciRuby/sciruby-notebooks) + +The interactive plotting library [Nyaplot](https://github.com/domitry/nyaplot) has some case studies using IRuby: + +* [War expenditure per GDP](http://nbviewer.ipython.org/github/domitry/nyaplot/blob/master/examples/notebook/Mapnya.ipynb#Case2-:Fill-countries-in-different-colors) +* [Finding shape consensus among multiple geometrical polygons](http://nbviewer.ipython.org/gist/mgiraldo/a68b53175ce5892531bc) + +### Perl + +* An example showcasing full use of the [display protocol](http://nbviewer.ipython.org/github/zmughal/zmughal-iperl-notebooks/blob/master/IPerl-demos/20150209_IPerl_display_demo.ipynb) with the IPerl kernel. + +## Miscellaneous topics about doing various things *with* the Notebook itself + +* [Blogging With IPython in Blogger](http://nbviewer.ipython.org/github/fperez/blog/blob/master/120907-Blogging%20with%20the%20IPython%20Notebook.ipynb), also available in [blog post form](http://blog.fperez.org/2012/09/blogging-with-ipython-notebook.html), [full repo here](https://github.com/fperez/blog). By Fernando Perez. + +* [Blogging With IPython in Octopress](http://nbviewer.ipython.org/urls/github.com/jakevdp/jakevdp.github.com/raw/master/downloads/notebooks/nb_in_octopress.ipynb), by Jake van der Plas and available as a [blog post](http://jakevdp.github.com/blog/2012/10/04/blogging-with-ipython/). Other [notebooks by Jake](https://github.com/jakevdp/jakevdp.github.com/tree/master/downloads/notebooks) contain many more great examples of doing interesting work with the scientific Python stack. + +* [Blogging With IPython in Nikola](http://nbviewer.ipython.org/url/www.damian.oquanta.info/posts/blogging-with-nikola-and-ipython.ipynb), also available in [blog post form](http://www.damian.oquanta.info/posts/blogging-with-nikola-and-ipython.html) by Damián Avila. 
+ +* [Custom CSS control of the notebook](http://nbviewer.ipython.org/github/Carreau/posts/blob/master/Blog1.ipynb), this is part of a [blog repo](https://github.com/Carreau/posts) by Matthias Bussonnier. + +* [IPython display hookery: tools to help display visual output from various sources](http://nbviewer.ipython.org/gist/5162445), a gist by [@deeplook](https://github.com/deeplook). + +* [Importing IPython Notebooks as Modules](http://nbviewer.ipython.org/gist/6011986) by [Min RK](http://github.com/minrk). + +## Reproducible academic publications + +This section contains academic papers that have been published in the peer-reviewed literature or pre-print sites such as the [ArXiv](http://arxiv.org) that include one or more notebooks that enable (even if only partially) readers to reproduce the results of the publication. If you include a publication here, please link to the journal article as well as providing the nbviewer notebook link (and any other relevant resources associated with the paper). + +1. [Reply to 'Influence of cosmic ray variability on the monsoon rainfall and temperature': a false-positive in the field of solar-terrestrial research](http://arxiv.org/abs/1502.00505) by [Benjamin Laken](http://www.benlaken.com), 2015. Reviewed article will appear in JASTP. The [IPython notebook](http://nbviewer.ipython.org/github/benlaken/Comment_BadruddinAslam2014/blob/master/Monsoon_analysis.ipynb) reproduces the full analysis and figures exactly as they appear in the article, and is available on Github: link via [figshare](http://figshare.com/articles/Comment_on_Badruddin_amp_Aslam_2014_/1299413). + +1. [The probability of improvement in Fisher's geometric model: a probabilistic approach](http://dx.doi.org/10.1016/j.tpb.2014.10.004), by [Yoav Ram](http://www.yoavram.com/) and [Lilach Hadany](https://sites.google.com/site/hadanylab/). (Theoretical Population Biology, 2014). 
An [IPython notebook](http://nbviewer.ipython.org/url/www.sciencedirect.com/science/MiamiMultiMediaURL/1-s2.0-S0040580914000811/1-s2.0-S0040580914000811-mmc1.txt/272364/FULL/S0040580914000811/471cf02085a52c248dc76ae65ad4409d/mmc1.txt), allowing figure reproduction, was deposited as a [supplementary file](http://www.sciencedirect.com/science/MiamiMultiMediaURL/1-s2.0-S0040580914000811/1-s2.0-S0040580914000811-mmc1.txt/272364/FULL/S0040580914000811/471cf02085a52c248dc76ae65ad4409d/mmc1.txt).
+
+1. [Stress-induced mutagenesis and complex adaptation](http://rspb.royalsocietypublishing.org/content/281/1792/20141025.abstract), by [Yoav Ram](http://www.yoavram.com/) and [Lilach Hadany](https://sites.google.com/site/hadanylab/) (Proceedings B, 2014). An [IPython notebook](https://github.com/yoavram/ruggedsim/blob/master/manuscript/supplementry.ipynb), allowing figure reproduction, was deposited as a [supplementary file](http://rspb.royalsocietypublishing.org/content/suppl/2014/08/19/rspb.2014.1025.DC1).
+
+1. [Automatic segmentation of odor maps in the mouse olfactory bulb using regularized non-negative matrix factorization](http://www.sciencedirect.com/science/article/pii/S1053811914003103), by J. Soelter et al. (Neuroimage 2014, Open Access). The [notebook](http://nbviewer.ipython.org/github/jansoe/FUImaging/blob/master/examples/IOSsegmentation/regNMF.ipynb) reproduces most figures from the paper and provides a deeper look at the data. The [full code repository](https://github.com/jansoe/FUImaging/tree/Neuroimage2014) is also available.
+
+1. [Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss, by A. Gross et al. (Nature Genetics 2014)](http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.3051.html). The [full collection of notebooks to replicate the results](https://github.com/theandygross/TCGA/tree/master/Analysis_Notebooks#guide-to-running) is available on GitHub.
+
+1. [powerlaw: a Python package for analysis of heavy-tailed distributions, by J.
Alstott et al.](https://code.google.com/p/powerlaw/). [Notebook of examples in manuscript](http://nbviewer.ipython.org/gist/19fcdd6a4ba400ce8de2), [ArXiv link](http://arxiv.org/abs/1305.0215) and [project repository](https://github.com/jeffalstott/powerlaw). + +1. [Collaborative cloud-enabled tools allow rapid, reproducible biological insights, by B. Ragan-Kelley et al.](http://www.nature.com/ismej/journal/v7/n3/full/ismej2012123a.html). The [main notebook](http://nbviewer.ipython.org/gist/3693491/cloud_demo_complete.ipynb), the [full collection of related notebooks](http://nbviewer.ipython.org/gist/3693491) and the [companion site](http://qiime.org/home_static/nih-cloud-apr2012) with the Amazon AMI information for reproducing the full paper. + +1. [A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data, by C.T. Brown et al.](http://ged.msu.edu/papers/2012-diginorm). [Full notebook](http://nbviewer.ipython.org/urls/github.com/ged-lab/2012-paper-diginorm/raw/master/notebook/diginorm.ipynb), [ArXiv link](http://arxiv.org/abs/1203.4802) and [project repository](https://github.com/ged-lab/2012-paper-diginorm). + +1. [The kinematics of the Local Group in a cosmological context](http://arxiv.org/abs/1303.2690) by [J.E. Forero-Romero et al.](http://wwwprof.uniandes.edu.co/~je.forero/). The [Full notebook](http://nbviewer.ipython.org/github/forero/LG_Kinematics/blob/master/code/main_analysis.ipynb) and also all the data in a [github repo](https://github.com/forero/LG_Kinematics). + +1. [Warming Ocean Threatens Sea Life](http://www.scientificamerican.com/article.cfm?id=warming-ocean-threatens-sea-life), an article in Scientific American [backed by a notebook for its main plot](http://nbviewer.ipython.org/github/robertodealmeida/notebooks/blob/master/scientific_american/Scientific%20American%20graph.ipynb). By [Roberto de Almeida](https://github.com/robertodealmeida) from [MarinExplore](https://marinexplore.com). + +1. 
[Extrapolating Weak Selection in Evolutionary Games](http://nbviewer.ipython.org/github/juliangarcia/ews/blob/master/notebook.ipynb), by Wu, García, Hauert and Traulsen. [PLOS Comp Bio paper](http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003381) and [Figshare link](http://figshare.com/articles/Extrapolating_weak_selection_in_evolutionary_games_source_code/814470).
+
+1. [Using neural networks to estimate redshift distributions. An application to CFHTLenS](http://nbviewer.ipython.org/urls/bitbucket.org/christopher_bonnett/nn_notebook/raw/5e69b55193a229cb2076a2f18e43b45c56e317e0/T-800.ipynb) by Christopher Bonnett, [paper](http://arxiv.org/abs/1312.1287) (submitted to MNRAS).
+
+1. [Mechanisms for stable, robust, and adaptive development of orientation maps in the primary visual cortex](http://dx.doi.org/10.1523/JNEUROSCI.1037-13.2013) by Jean-Luc R. Stevens, Judith S. Law, Jan Antolik, and James A. Bednar. Journal of Neuroscience, 33:15747-15766, 2013. [Notebook1](https://ioam.github.io/topographica/_static/gcal_notebook.html), [Notebook2](https://ioam.github.io/topographica/_static/stevens_jn13_notebook.html).
+
+1. [Accelerated Randomized Benchmarking](http://nbviewer.ipython.org/github/cgranade/accelerated-randomized-benchmarking/blob/master/src/model_testing.ipynb), by [Christopher Granade](http://www.cgranade.com/), [Christopher Ferrie](https://sites.google.com/site/csferrie/) and D. G. Cory. [New Journal of Physics **17** 013042 (2015)](http://iopscience.iop.org/article/10.1088/1367-2630/17/1/013042/meta;jsessionid=1F48CDD7C7D7849B0777C495ED1551CC.c1), [arXiv](http://arxiv.org/abs/1404.5275), [GitHub repo](https://github.com/cgranade/accelerated-randomized-benchmarking).
+
+1. [Dynamics and associations of microbial community types across the human body](http://dx.doi.org/10.1038/nature13178), by Tao Ding & Patrick D. Schloss. [Notebook replicating results](http://nbviewer.ipython.org/gist/pschloss/9815766/notebook.ipynb).
+
+1.
[Variations in submarine channel sinuosity as a function of latitude and slope](http://nbviewer.ipython.org/gist/zsylvester/6040d0015b9b907bc788), by Sylvester, Z., Pirmez, C., Cantelli, A., & Jobe, Z. R. + +1. [Frontoparietal representations of task context support the flexible control of goal directed cognition](http://www.jneurosci.org/content/34/32/10743.short), by M.L. Waskom, D. Kumaran, A.M. Gordon, J. Rissman, & A.D. Wagner. [Github repository](https://github.com/WagnerLabPapers/Waskom_JNeurosci_2014) | [Main notebook](http://nbviewer.ipython.org/github/WagnerLabPapers/Waskom_JNeurosci_2014/blob/master/Behavioral_and_Decoding_Analyses.ipynb) + +1. [pyparty: Intuitive Particle Processing in Python](http://openresearchsoftware.metajnl.com/article/view/jors.bh), Adam Hughes [Notebook to Generate the Published Figures](http://nbviewer.ipython.org/github/hugadams/pyparty/blob/master/examples/Notebooks/JORS_data.ipynb?create=1) | Also, check out the [pyparty tutorial notebooks](https://github.com/hugadams/pyparty). + +1. [Indication of family-specific DNA methylation patterns in developing oysters](http://biorxiv.org/content/early/2014/12/16/012831), Claire E. Olson, Steven B. Roberts +doi: http://dx.doi.org/10.1101/012831. [Notebook to generate results in the paper](http://nbviewer.ipython.org/github/che625/olson-ms-nb/blob/master/BiGo_dev.ipynb). + +1. [Parallel Prefix Polymorphism Permits Parallelization, Presentation & Proof](http://conferences.computer.org/hptcdl/2014/papers/7020a047.pdf), [Jiahao Chen](http://jiahao.github.io) and [Alan Edelman](http://www-math.mit.edu/~edelman/), HPTCDL'14. [Website](http://jiahao.github.io/parallel-prefix) and [notebook](https://github.com/jiahao/ijulia-notebooks/blob/master/2014-08-06-parallel-prefix.ipynb) + +1. 
[Transcriptome Sequencing Reveals Potential Mechanism of Cryptic 3’ Splice Site Selection in *SF3B1*-mutated Cancers](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004105) by [Christopher DeBoever](http://cdeboever3.github.io/) et al. There are several [notebooks](https://github.com/cdeboever3/deboever-sf3b1-2015/tree/master/notebooks) to replicate results and make figures. + +1. [A Workflow for Characterizing Nanoparticle Monolayers for Biosensors: Machine Learning on Real and Artificial SEM Images](https://peerj.com/preprints/671/), Adam Hughes, Zhaowen Liu, Maryam Raftari, Mark. E Reeves. Notebooks are linked in Table 1 in the text. + +1. [AtomPy: An Open Atomic Data Curation Environment for Astrophysical Applications](http://www.mdpi.com/2218-2004/2/2/123), by C. Mendoza, J. Boswell, D. Ajoku, M. Bautista. + +1. [Visualizing 4-Dimensional Asteroids](http://blogs.scientificamerican.com/sa-visual/2014/09/16/visualizing-4-dimensional-asteroids/), in Scientific American (by Jake VanderPlas) + +1. [Challenges and opportunities in understanding microbial communities with metagenome assembly](http://journal.frontiersin.org/article/10.3389/fmicb.2015.00678/abstract), [accompanied by IPython Notebook tutorial](http://nbviewer.ipython.org/github/germs-lab/frontiers-review-2015/blob/master/frontiers-nb-2015.ipynb), by [Adina Howe](http://germslab.org) and Patrick Chain. + +## Data-driven journalism + +* [The Need for Openness in Data Journalism](http://nbviewer.ipython.org/github/brianckeegan/Bechdel/blob/master/Bechdel_test.ipynb), by [Brian Keegan](http://www.brianckeegan.com). + +* [St. Louis County Segregation Analysis](https://github.com/BuzzFeedNews/2014-08-st-louis-county-segregation) , analysis for the article [The Ferguson Area Is Even More Segregated Than You Probably Guessed](http://www.buzzfeed.com/jsvine/the-ferguson-area-is-even-more-segregated-than-you-thought) by [Jeremy Singer-Vine](https://twitter.com/jsvine). 
+
+## Whimsical notebooks
+
+* [XKCD-styled plots created with Matplotlib](http://nbviewer.ipython.org/url/jakevdp.github.com/downloads/notebooks/XKCD_plots.ipynb). Here is the [blog post version](http://jakevdp.github.com/blog/2012/10/07/xkcd-style-plots-in-matplotlib/) with discussion. By Jake van der Plas.
+
+* [Van Gogh's Starry Night with ipythonblocks](http://nbviewer.ipython.org/github/jiffyclub/ipythonblocks/blob/master/demos/starry_night_to_text.ipynb), part of Matt Davis' [ipythonblocks](https://github.com/jiffyclub/ipythonblocks). This is a teaching tool for use with the IPython notebook that provides visual elements to understand programming concepts.
+
+* [Conway's Game of Life](http://nbviewer.ipython.org/gist/3778422). An interesting use of a convolution operation to compute the next state of the game board, instead of the obvious approach of finding each cell's neighbors and filtering the board.
+
+* [pynguins](https://gist.github.com/denfromufa/9a5e1fdeaf611dc60ea8). Uses Jupyter Notebook, Python, and NumPy to solve the board game "Penguins on Ice".
+
+* ["People plots"](http://nbviewer.ipython.org/gist/4544012), stick figures generated with matplotlib.
+
+* [Reveal converter mini-tutorial](http://nbviewer.ipython.org/url/www.damian.oquanta.info/posts/reveal-converter-mini-tutorial.ipynb), also available in [blog post form](http://www.damian.oquanta.info/posts/reveal-converter-mini-tutorial.html). Do you want to make a static HTML/CSS slideshow straight from the IPython notebook? OK, now you can do it with the *reveal* converter (nbconvert). [Demo](http://www.slideviper.oquanta.info/tutorial/slideshow_tutorial_slides.html) by Damián Avila.
+
+* [Personal IPython Weight Notebook](http://nbviewer.ipython.org/gist/9769238). Plot your weight loss with prognosis and motivation features.
+
+* [Streaming Double Pendulum Simulation in IPython NB](http://nbviewer.ipython.org/github/plotly/python-user-guide/blob/master/s7_streaming/s7_streaming.ipynb).
+
+* [Porque Charles Xavier debe cambiar a Cerebro por Python](http://nbviewer.ipython.org/github/mshopper/aurora/blob/master/Aurora.ipynb), a study in data and gender in the Marvel comics universe, by [Mai Giménez](http://twitter.com/adahopper) and [Angela Rivera](http://twitter.com/ghilbrae).
+
+* [Functional Geometry: a deconstruction of the MC Escher woodcut Square Limit](http://nbviewer.ipython.org/github/shashi/ijulia-notebooks/blob/master/funcgeo/Functional%20Geometry.ipynb), an IJulia notebook by [Shashi Gowda](http://shashi.github.io/).
+
+## Videos of IPython being used in the wild
+
+Of course, the first thing you might try is searching for videos about IPython (1900 or so by last count on YouTube), but there are also demonstrations of other applications that use the power of IPython without mentioning it in their descriptions. Below are a few such:
+
+* [Video](http://www.youtube.com/watch?v=Nc16qeGBtMU) on how to learn Python featuring IPython as the platform of choice for learning!
+
+* [This video](http://www.youtube.com/watch?v=4ONBVNm3isI) shows IPython being used in the [scikit-learn](http://scikit-learn.org) project.
+
+* The speaker doesn't show IPython in use, but his IPython sticker is visible for the entire video: [Planning and Tending the Garden: The Future of Early Childhood Python Education](http://www.youtube.com/watch?v=op61s-QHryk)
+
+* [Wes McKinney's speech](http://youtu.be/qbYYamU42Sw?t=5m9s) on Python and data analysis features IPython, as does his book [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do).
+
+* [This video](https://www.youtube.com/watch?v=zG8FYPFU9n4) shows Plotly and IPython in use at a Montreal Python meetup.
diff --git a/userguide/zh/notebook/nb.md b/userguide/zh/notebook/nb.md
new file mode 100644
index 0000000..51e8cc9
--- /dev/null
+++ b/userguide/zh/notebook/nb.md
@@ -0,0 +1,104 @@
+## Using the Notebook
+
+In the Dashboard, choose ```New - Python3``` to create a Python3 Notebook. This
+opens the interactive Python3 Notebook programming environment and generates an
+IPython Notebook file named ```untitled.ipynb```.
+
+This section covers a basic [Hello World](#hello-world), [a machine learning example
+with visualized results](#a-machine-learning-example-with-visualized-results), and a link to a [full tutorial](#full-tutorial).
+
+### Hello World ###
+
+Enter the following in a Cell:
+
+```python
+print("Hello World")
+```
+
+Running it produces the output:
+
+```
+Hello World
+```
+
+
+### A machine learning example with visualized results
+
+Plotting requires loading matplotlib. Enter and run the following in a Cell:
+
+```python
+%matplotlib inline
+```
+
+or
+
+```python
+%matplotlib notebook
+```
+
+This enables plotting in the browser.
+
+In a later Cell, enter and run the following machine learning code:
+
+```python
+# import packages
+import numpy as np
+import matplotlib.pyplot as plt
+from sklearn import linear_model, datasets
+
+# load data: keep only target==0 and target==1 (binary classification)
+# and use feature 0 (sepal length) and feature 2 (petal length)
+iris = datasets.load_iris()
+X = iris.data[iris.target!=2][:, [0,2]]
+Y = iris.target[iris.target!=2]
+
+h = .02  # step size in the mesh
+
+logreg = linear_model.LogisticRegression(C=1e5)
+logreg.fit(X, Y)
+
+# Plot the decision boundary. For that, we will assign a color to each
+# point in the mesh [x_min, x_max]x[y_min, y_max].
+x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
+y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
+xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
+Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()])
+
+# Put the result into a color plot
+Z = Z.reshape(xx.shape)
+#plt.figure(1, figsize=(4, 3))
+plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)
+
+# Plot also the training points
+plt.scatter(X[:, 0], X[:, 1], c=Y, edgecolors='k', cmap=plt.cm.Paired)
+# axis labels match the two selected features (columns 0 and 2)
+plt.xlabel('Sepal length')
+plt.ylabel('Petal length')
+
+plt.xlim(xx.min(), xx.max())
+plt.ylim(yy.min(), yy.max())
+plt.xticks(())
+plt.yticks(())
+
+plt.savefig("learn.svg")
+```
+
+Shortly afterwards you will see the following output:
+
+
+
+A vector graphics file, ```learn.svg```, is also generated in the current directory.
+You can open it from the Dashboard, or display it directly in a Cell:
+
+```
+from IPython.display import SVG
+SVG(filename="learn.svg")
+```
+
+
+### Full tutorial ###
+
+An IPython Notebook can do much more, such as displaying image, video, and HTML
+files from the local disk, loading an external website, and rendering LaTeX formulas.
+For a complete introduction and tutorial, see [here](http://nbviewer.jupyter.org/github/ipython/ipython/blob/4.0.x/examples/IPython%20Kernel/Index.ipynb).
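
The example notebooks referenced in this guide are `.ipynb` files, which are JSON documents (nbformat 4). As a rough sketch, assuming a notebook has already been saved or downloaded, its code cells can be listed with the standard library alone. The helper name `code_cells` and the tiny hand-written notebook below are illustrative only and are not part of Docklet or Jupyter.

```python
import json


def code_cells(notebook_text):
    """Return the source of every code cell in an nbformat-4 notebook."""
    nb = json.loads(notebook_text)
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            src = cell.get("source", "")
            # nbformat stores "source" either as one string or a list of lines
            sources.append(src if isinstance(src, str) else "".join(src))
    return sources


# A minimal hand-written notebook, used here only for illustration.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print(\"Hello World\")"]},
    ],
})

print(code_cells(example))
```

For a real file, the text could be read with `open("untitled.ipynb", encoding="utf-8").read()` and passed to the same function.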
+
+[nbviewer](http://nbviewer.jupyter.org) also provides many other example applications.
diff --git a/userguide/zh/notebook/python.md b/userguide/zh/notebook/python.md
new file mode 100644
index 0000000..f2201a5
--- /dev/null
+++ b/userguide/zh/notebook/python.md
@@ -0,0 +1,821 @@
+## Python Packages ##
+
+Python3-related packages in the Docklet base image:
+
+| Python Software | License | Summary | Note |
+|---|---|---|---|
+| basemap | PSF | Plot data on map projections with Matplotlib | Map |
+| bcolz | BSD | Columnar and compressed data containers. | Data |
+| bcrypt | Apache | Modern password hashing for your software and your servers | Crypt |
+| BeautifulSoup4 | MIT | Sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree. | Web |
+| biopython | (as-is) BSD-like | Tools for biological computation | Biological |
+| blaze | BSD | Data migration utilities | Data |
+| bokeh | BSD | Statistical and novel interactive HTML plots for Python | Plot |
+| cartopy | LGPL | A Python package designed to make drawing maps for data analysis and visualisation as easy as possible | Map |
+| chaco | BSD | Library for generating interactive 2D plots | Plot |
+| cheetah | MIT | Template engine and code generation tool | Web |
+| coverage | BSD | Code coverage measurement for Python | Dev |
+| cryptography | Apache 2.0 | cryptography is a package which provides cryptographic recipes and primitives to Python developers. | Crypt |
+| Cython | Apache | Python-based language for writing C extensions for Python | Dev |
+| docutils | public-domain with exceptions (BSD, PSF) | Documentation utilities | Doc |
+| flask | BSD | A lightweight web application framework based on Werkzeug, Jinja2 and good intentions. | Web |
+| Flask-Babel | BSD | Adds i18n/l10n support to Flask applications | Web |
+| gevent | MIT | Python network library using greenlet and libevent for easy and scalable concurrency | Sys |
+| gevent_websocket | Apache 2.0 | Websocket handler for the gevent pywsgi server, a Python network library | Network |
+| ggplot | BSD | ggplot for python | |
+| h5py | BSD | Python interface to the HDF library | Data |
+| Jinja2 | BSD | Template engine | Web |
+| jupyter | BSD | Jupyter metapackage. Install all the Jupyter components in one go. | Web |
+| lmfit | BSD | Least-Squares Minimization with Bounds and Constraints | Scipy |
+| lxml | BSD | XML/XSLT library with bindings to libxml2/libxslt | Doc |
+| M2Crypto | MIT | M2Crypto: A Python crypto and SSL toolkit | Crypt |
+| matplotlib | PSF | Interactive 2D plotting library | Plot |
+| MDP | BSD | Modular toolkit for Data Processing (MDP) | Data |
+| memory_profiler | BSD | A module for monitoring memory usage of a python program | Dev |
+| msgpack | Apache 2.0 | Efficient binary serialization library | Dev |
+| multipledispatch | BSD | Multiple dispatch in Python. | Dev |
+| networkx | BSD | Create, manipulate, and analyze graphs and networks | Graph |
+| nltk | Apache 2.0 | Suite of tools for symbolic and statistical natural language processing | NLP |
+| numexpr | MIT | Fast evaluation of array expressions | Math |
+| numpy | BSD | General-purpose multi-dimensional array-processing and math library | Math |
+| openpyxl | MIT/Expat | Read and write Excel OpenXML files | Excel |
+| pandas | BSD | Data manipulation and analysis library | Data |
+| paste | MIT | Tools for using a Web Server Gateway Interface stack | Web |
+| patsy | BSD | Package for describing statistical models and building design matrices | Data |
+| pep8 | BSD | Python style guide checker | Dev |
+| pexpect | ISC license | Pexpect allows easy control of interactive console applications. | Sys |
+| PIL | MIT | Image processing library | Image |
+| plotly | MIT | Python plotting library for collaborative, interactive, publication-quality graphs. | Plot |
+| ply | BSD | Python implementation of lex and yacc | Dev |
+| prettyplotlib | MIT | Painlessly create beautiful default matplotlib plots. | Plot |
+| psutil | BSD | OS interface to processes and tasks | Sys |
+| py | MIT | Library with cross-python path, ini-parsing, io, code, log facilities | Sys |
+| pyephem | LGPL | Scientific-grade astronomy routines | Astronomy |
+| pyflakes | MIT | Static analysis of Python code to detect common errors | Dev |
+| Pygments | BSD | Code syntax highlighting package written in Python | Dev |
+| pyOpenSSL | Apache License 2 | Python bindings to the OpenSSL library | Network |
+| pysal | BSD | A library of spatial analysis functions. | Scipy |
+| PyTables | BSD | Hierarchical datasets for extremely large data | Data |
+| pytest | MIT | Simple powerful testing with Python | Dev |
+| PyYAML | MIT | YAML parser and emitter | Doc |
+| pyzmq | BSD-like | Binding to the ZeroMQ fast messaging library | Middleware |
+| requests | Apache 2.0 | Elegant and simple HTTP library for Python | Web |
+| scikit_learn | BSD | Machine learning and data mining routines | ML |
+| SciPy | BSD | Libraries for mathematics, science, and engineering | Scipy |
+| SCons | MIT | Pythonic substitute for Make | Dev |
+| seaborn | BSD | Seaborn: statistical data visualization | Data |
+| setuptools | PSF | Easily download, build, install, upgrade, and uninstall Python packages | Sys |
+| Shapely | BSD | Geometric objects, predicates, and operations | Geo |
+| simplegeneric | ZPL 2.1 | Simple generic functions (similar to Python's own len(), pickle.dump(), etc.) | Dev |
+| Sphinx | BSD | Creates intelligent and beautiful project documentation | Doc |
+| SQLAlchemy | MIT | SQL toolkit and Object Relational Mapper | Data |
+| statsmodels | BSD | Statistical computations and models for use with SciPy | Scipy |
+| SymPy | BSD | Symbolic mathematics library | Math |
+| Tornado | Apache 2.0 | Scalable, non-blocking web server | Web |
+| virtualenv | MIT | Virtual Python Environment builder | Sys |
+| werkzeug | BSD | Advanced WSGI utility modules | Web |
+| xarray | Apache | N-D labeled arrays and datasets in Python | Math |
+| xlrd | BSD | Extract data from Microsoft Excel (tm) spreadsheet files | Excel |
+| xlsxwriter | BSD | A Python module for creating Excel XLSX files. | Excel |
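
To check which of the listed packages are actually present in a given environment, a short script along these lines may help. It is only a sketch using the standard-library `importlib.metadata` module (Python 3.8+); the helper name `installed_versions` is hypothetical, and the names passed in are assumed to be distribution names as used on PyPI, which can differ from the names in the table (for example `scikit-learn` rather than `scikit_learn`).

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # A few names from the package list; extend as needed (assumed PyPI names).
    wanted = ["numpy", "pandas", "matplotlib", "scikit-learn"]
    for name, version in installed_versions(wanted).items():
        print(f"{name:15s} {version or 'not installed'}")
```

Run inside a Notebook Cell or a terminal of the workspace, this prints one line per package with either its version or `not installed`.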
+
+
diff --git a/userguide/zh/notebook/r.md b/userguide/zh/notebook/r.md
new file mode 100644
index 0000000..4954367
--- /dev/null
+++ b/userguide/zh/notebook/r.md
@@ -0,0 +1,1844 @@
+## R Packages ##
+
+### Recommended R packages ###
+
+The RStudio article [Quick list of useful R packages](https://support.rstudio.com/hc/en-us/articles/201057987-Quick-list-of-useful-R-packages) gives a recommended list of R packages.
+
+

Many useful R functions come in packages, free libraries of code written by R's active user community. To install an R package, open an R session and type at the command line:

+
install.packages("<the package's name>")
+
+

R will download the package from CRAN, so you'll need to be connected to the internet. Once you have a package installed, you can make its contents available to use in your current R session by running

+
library("<the package's name>")
+
+

There are thousands of helpful R packages for you to use, but navigating them all can be a challenge. To help you out, we've compiled this guide to some of the best. We've used each of these, and found them to be outstanding – we've even written some of them. But you don't have to take our word for it: these packages are also among the most downloaded R packages.

+

To load data

+

RODBC, RMySQL, RPostgreSQL, RSQLite - If you'd like to read in data from a database, these packages are a good place to start. Choose the package that fits your type of database.

+

XLConnect, xlsx - These packages help you read and write Microsoft Excel files from R. You can also just export your spreadsheets from Excel as .csv files.

+

foreign - Want to read a SAS data set into R? Or an SPSS data set? Foreign provides functions that help you load data files from other programs into R.

+

R can handle plain text files – no package required. Just use the functions read.csv, read.table, and read.fwf. If you have even more exotic data, consult the CRAN guide to data import and export.

+ +

To manipulate data

+ +

dplyr - Essential shortcuts for subsetting, summarizing, rearranging, and joining together data sets. dplyr is our go-to package for fast data manipulation.

+

tidyr - Tools for changing the layout of your data sets. Use the gather and spread functions to convert your data into the tidy format, the layout R likes best.

+

stringr - Easy to learn tools for regular expressions and character strings.

+

lubridate - Tools that make working with dates and times easier.

+ +

To visualize data

+ +

ggplot2 - R's famous package for making beautiful graphics. ggplot2 lets you use the grammar of graphics to build layered, customizable plots.

+

ggvis - Interactive, web based graphics built with the grammar of graphics.

+

rgl - Interactive 3D visualizations with R

+

htmlwidgets - A fast way to build interactive (javascript based) visualizations with R. Packages that implement htmlwidgets include:

+ +

 

+

googleVis - Lets you use Google Chart tools to visualize data in R. Google Chart tools used to be called Gapminder, the graphing software Hans Rosling made famous in his TED talk.

+ +

To model data

+ +

car - car's Anova function is popular for making type II and type III Anova tables.

+

mgcv - Generalized Additive Models

+

lme4/nlme - Linear and Non-linear mixed effects models

+

randomForest - Random forest methods from machine learning

+

multcomp - Tools for multiple comparison testing

+

vcd - Visualization tools and tests for categorical data

+

glmnet - Lasso and elastic-net regression methods with cross validation

+

survival - Tools for survival analysis

+

caret - Tools for training regression and classification models

+ +

To report results

+ +

shiny - Easily make interactive web apps with R. A perfect way to explore data and share findings with non-programmers.

+

R Markdown - The perfect workflow for reproducible reporting. Write R code in your markdown reports. When you run render, R Markdown will replace the code with its results and then export your report as an HTML, pdf, or MS Word document, or a HTML or pdf slideshow. The result? Automated reporting. R Markdown is integrated straight into RStudio.

+

xtable - The xtable function takes an R object (like a data frame) and returns the LaTeX or HTML code you need to paste a pretty version of the object into your documents. Copy and paste, or pair up with R Markdown.

+ +

For Spatial data

+ +

sp, maptools - Tools for loading and using spatial data including shapefiles.

+

maps - Easy to use map polygons for plots.

+

ggmap - Download street maps straight from Google maps and use them as a background in your ggplots.

+ +

For Time Series and Financial data

+ +

zoo - Provides the most popular format for saving time series objects in R.

+

xts - Very flexible tools for manipulating time series data sets.

+

quantmod - Tools for downloading financial data, plotting common charts, and doing technical analysis.

+ +

To write high performance R code

+ +

Rcpp - Write R functions that call C++ code for lightning fast speed.

+

data.table - An alternative way to organize data sets for very, very fast operations. Useful for big data.

+

parallel - Use parallel processing in R to speed up your code or to crunch large data sets.

+ +

To work with the web

+ +

XML - Read and create XML documents with R

+

jsonlite - Read and create JSON data tables with R

+

httr - A set of useful tools for working with http connections

+ +

To write your own R packages

+ +

devtools - An essential suite of tools for turning your code into an R package.

+

testthat - testthat provides an easy way to write unit tests for your code projects.

+

roxygen2 - A quick way to document your R packages. roxygen2 turns inline code comments into documentation pages and builds a package namespace.

+

You can also read about the entire package development process online in Hadley Wickham's R Packages book

+
+
+### Top 100 R packages for 2013 (Jan-May)
+
+A 2013 [article](http://www.r-statistics.com/2013/06/top-100-r-packages-for-2013-jan-may/) on http://www.r-statistics.com analyzed R package download statistics:
+
+| Rank | Package | Title | Downloads |
+|---|---|---|---|
+| 1 | plyr | Tools for splitting, applying and combining data | 84049 |
+| 2 | digest | Create cryptographic hash digests of R objects | 83192 |
+| 3 | ggplot2 | An implementation of the Grammar of Graphics | 82768 |
+| 4 | colorspace | Color Space Manipulation | 81901 |
+| 5 | stringr | Make it easier to work with strings | 77658 |
+| 6 | RColorBrewer | ColorBrewer palettes | 66783 |
+| 7 | reshape2 | Flexibly reshape data: a reboot of the reshape package | 64911 |
+| 8 | zoo | S3 Infrastructure for Regular and Irregular Time Series (Z’s ordered observations) | 60844 |
+| 9 | proto | Prototype object-based programming | 59043 |
+| 10 | scales | Scale functions for graphics | 58369 |
+| 11 | car | Companion to Applied Regression | 57453 |
+| 12 | dichromat | Color Schemes for Dichromats | 56624 |
+| 13 | gtable | Arrange grobs in tables | 54431 |
+| 14 | munsell | Munsell colour system | 53183 |
+| 15 | labeling | Axis Labeling | 51877 |
+| 16 | Hmisc | Harrell Miscellaneous | 47836 |
+| 17 | rJava | Low-level R to Java interface | 47731 |
+| 18 | mvtnorm | Multivariate Normal and t Distributions | 46884 |
+| 19 | bitops | Bitwise Operations | 45689 |
+| 20 | rgl | 3D visualization device system (OpenGL) | 41001 |
+| 21 | foreign | Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, .. | 37849 |
+| 22 | XML | Tools for parsing and generating XML within R and S-Plus | 37153 |
+| 23 | lattice | Lattice Graphics | 36597 |
+| 24 | e1071 | Misc Functions of the Department of Statistics (e1071), TU Wien | 35180 |
+| 25 | gtools | Various R programming tools | 35028 |
+| 26 | sp | classes and methods for spatial data | 34786 |
+| 27 | gdata | Various R programming tools for data manipulation | 34262 |
+| 28 | Rcpp | Seamless R and C++ Integration | 33929 |
+| 29 | MASS | Support Functions and Datasets for Venables and Ripley’s MASS | 33667 |
+| 30 | Matrix | Sparse and Dense Matrix Classes and Methods | 30740 |
+| 31 | lmtest | Testing Linear Regression Models | 30319 |
+| 32 | survival | Survival Analysis | 30186 |
+| 33 | caTools | Tools: moving window statistics, GIF, Base64, ROC AUC, etc | 29945 |
+| 34 | multcomp | Simultaneous Inference in General Parametric Models | 29871 |
+| 35 | RCurl | General network (HTTP/FTP/…) client interface for R | 28866 |
+| 36 | knitr | A general-purpose package for dynamic report generation in R | 28104 |
+| 37 | xtable | Export tables to LaTeX or HTML | 28091 |
+| 38 | xts | eXtensible Time Series | 28058 |
+| 39 | rpart | Recursive Partitioning | 27812 |
+| 40 | evaluate | Parsing and evaluation tools that provide more details than the default | 27617 |
+| 41 | RODBC | ODBC Database Access | 26131 |
+| 42 | quadprog | Functions to solve Quadratic Programming Problems | 25433 |
+| 43 | tseries | Time series analysis and computational finance | 25144 |
+| 44 | DBI | R Database Interface | 24793 |
+| 45 | nlme | Linear and Nonlinear Mixed Effects Models | 24360 |
+| 46 | lme4 | Linear mixed-effects models using S4 classes | 24199 |
+| 47 | reshape | Flexibly reshape data | 24118 |
+| 48 | sandwich | Robust Covariance Matrix Estimators | 24016 |
+| 49 | leaps | regression subset selection | 23666 |
+| 50 | gplots | Various R programming tools for plotting data | 23251 |
+| 51 | abind | Combine multi-dimensional arrays | 22758 |
+| 52 | randomForest | Breiman and Cutler’s random forests for classification and regression | 22401 |
+| 53 | Rcmdr | R Commander | 22131 |
+| 54 | coda | Output analysis and diagnostics for MCMC | 21900 |
+| 55 | maps | Draw Geographical Maps | 21550 |
+| 56 | igraph | Network analysis and visualization | 21423 |
+| 57 | formatR | Format R Code Automatically | 21049 |
+| 58 | maptools | Tools for reading and handling spatial objects | 20957 |
+| 59 | RSQLite | SQLite interface for R | 19671 |
+| 60 | psych | Procedures for Psychological, Psychometric, and Personality Research | 19545 |
+| 61 | KernSmooth | Functions for kernel smoothing for Wand & Jones (1995) | 19166 |
+| 62 | rgdal | Bindings for the Geospatial Data Abstraction Library | 19064 |
+| 63 | RcppArmadillo | Rcpp integration for Armadillo templated linear algebra library | 18899 |
+| 64 | effects | Effect Displays for Linear, Generalized Linear, Multinomial-Logit, Proportional-Odds Logit Models and Mixed-Effects Models | 18843 |
+

65

+
+

sem

+
+

Structural Equation Models

+
+

18711

+
+

66

+
+

vcd

+
+

Visualizing Categorical Data

+
+

18589

+
+

67

+
+

XLConnect

+
+

Excel Connector for R

+
+

18230

+
+

68

+
+

markdown

+
+

Markdown rendering for R

+
+

18211

+
+

69

+
+

timeSeries

+
+

Rmetrics – Financial Time Series Objects

+
+

17932

+
+

70

+
+

timeDate

+
+

Rmetrics – Chronological and Calendar Objects

+
+

17838

+
+

71

+
+

RJSONIO

+
+

Serialize R objects to JSON, JavaScript Object Notation

+
+

17801

+
+

72

+
+

cluster

+
+

Cluster Analysis Extended Rousseeuw et al

+
+

17136

+
+

73

+
+

scatterplot3d

+
+

3D Scatter Plot

+
+

17110

+
+

74

+
+

nnet

+
+

Feed-forward Neural Networks and Multinomial Log-Linear Models

+
+

17074

+
+

75

+
+

fBasics

+
+

Rmetrics – Markets and Basic Statistics

+
+

16278

+
+

76

+
+

forecast

+
+

Forecasting functions for time series and linear models

+
+

15638

+
+

77

+
+

quantreg

+
+

Quantile Regression

+
+

15509

+
+

78

+
+

foreach

+
+

Foreach looping construct for R

+
+

15405

+
+

79

+
+

chron

+
+

Chronological objects which can handle dates and times

+
+

15226

+
+

80

+
+

plotrix

+
+

Various plotting functions

+
+

15142

+
+

81

+
+

matrixcalc

+
+

Collection of functions for matrix calculations

+
+

15107

+
+

82

+
+

aplpack

+
+

Another Plot PACKage: stem.leaf, bagplot, faces, spin3R, and

+

some slider functions

+
+

14654

+
+

83

+
+

strucchange

+
+

Testing, Monitoring, and Dating Structural Changes

+
+

14503

+
+

84

+
+

iterators

+
+

Iterator construct for R

+
+

14449

+
+

85

+
+

mgcv

+
+

Mixed GAM Computation Vehicle with GCV/AIC/REML smoothness

+

estimation

+
+

14186

+
+

86

+
+

kernlab

+
+

Kernel-based Machine Learning Lab

+
+

14135

+
+

87

+
+

SparseM

+
+

Sparse Linear Algebra

+
+

13921

+
+

88

+
+

tree

+
+

Classification and regression trees

+
+

13871

+
+

89

+
+

robustbase

+
+

Basic Robust Statistics

+
+

13778

+
+

90

+
+

vegan

+
+

Community Ecology Package

+
+

13686

+
+

91

+
+

devtools

+
+

Tools to make developing R code easier

+
+

13488

+
+

92

+
+

latticeExtra

+
+

Extra Graphical Utilities Based on Lattice

+
+

13253

+
+

93

+
+

modeltools

+
+

Tools and Classes for Statistical Models

+
+

13233

+
+

94

+
+

xlsx

+
+

Read, write, format Excel 2007 and Excel 97/2000/XP/2003 files

+
+

13097

+
+

95

+
+

slam

+
+

Sparse Lightweight Arrays and Matrices

+
+

13060

+
+

96

+
+

TTR

+
+

Technical Trading Rules

+
+

12894

+
+

97

+
+

quantmod

+
+

Quantitative Financial Modelling Framework

+
+

12892

+
+

98

+
+

relimp

+
+

Relative Contribution of Effects in a Regression Model

+
+

12692

+
+

99

+
+

akima

+
+

Interpolation of irregularly spaced data

+
+

12680

+
+

100

+
+

memoise

+
+

Memoise functions

+
+

12600

+
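The ranking above uses CRAN package names, while Debian and Ubuntu ship R packages under the `r-cran-<lowercased name>` convention (for example, `randomForest` becomes `r-cran-randomforest`). The following is only a minimal sketch of that mapping, assuming a dpkg-based system; the helper names `cran_to_deb` and `is_installed` are hypothetical and not part of Docklet, and a generated name does not guarantee that such a package exists in the archive.

```shell
#!/bin/sh
# Hypothetical helper: build the Debian/Ubuntu package name for a CRAN package.
# It only lowercases the name and adds the r-cran- prefix; it does not check
# that the archive actually provides that package.
cran_to_deb() {
    printf 'r-cran-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Hypothetical helper: succeed if the mapped package is known to dpkg.
# `dpkg -s` exits non-zero for packages that are not installed.
is_installed() {
    dpkg -s "$(cran_to_deb "$1")" >/dev/null 2>&1
}

# Example: report the status of a few packages from the ranking.
for pkg in rJava randomForest Rcpp; do
    if is_installed "$pkg"; then
        status="installed"
    else
        status="not installed"
    fi
    printf '%s -> %s (%s)\n' "$pkg" "$(cran_to_deb "$pkg")" "$status"
done
```

Inside R itself, `library(<name>)` remains the direct way to confirm that a package loads; the sketch is only meant for a quick check from the terminal.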
+
+### Available R Packages ###
+
+The Docklet base image provides `r-base`, `r-recommended`, and `r-cran-*` from the Ubuntu repositories, that is, all R packages available there.
+
+They include:
+
+ * r-base - GNU R statistical computation and graphics system
+ * r-base-core - GNU R core of statistical computation and graphics system
+ * r-base-core-dbg - GNU R debug symbols for statistical comp. language and environment
+ * r-base-dev - GNU R installation of auxiliary GNU R packages
+ * r-base-html - GNU R html docs for statistical computing system functions
+ * r-bioc-qvalue - GNU R package for Q-value estimation for FDR control
+ * r-cran-abind - GNU R abind multi-dimensional array combination function
+ * r-cran-acepack - GNU R package for regression transformations
+ * r-cran-ade4 - GNU R analysis of ecological data
+ * r-cran-afex - GNU R package for analyzing factorial experiments using ANOVA or mixed models
+ * r-cran-amelia - GNU R package supporting multiple imputation of missing data
+ * r-cran-amore - GNU R: A MORE flexible neural network package
+ * r-cran-ape - GNU R package for Analyses of Phylogenetics and Evolution
+ * r-cran-base64enc - GNU R package that provides tools for base64 encoding
+ * r-cran-batchjobs - GNU R batch computing
+ * r-cran-bayesfactor - GNU R package providing Bayes factors for t-tests, ANOVAs and contingency tables
+ * r-cran-bayesm - GNU R package for Bayesian inference
+ * r-cran-bbmisc - GNU R Miscellaneous helper functions for B. Bischl
+ * r-cran-beeswarm - bee swarm plot, an alternative to stripchart
+ * r-cran-biasedurn - GNU R Biased Urn model distributions
+ * r-cran-bio3d - GNU R package for biological structure analysis
+ * r-cran-bitops - GNU R package implementing bitwise operations
+ * r-cran-blockmodeling - Generalized and classical blockmodeling of valued networks
+ * r-cran-bms - GNU R package for Bayesian model averaging for linear models
+ * r-cran-boolnet - assembling, analyzing and visualizing Boolean networks
+ * r-cran-boot - GNU R package for bootstrapping functions from Davison and Hinkley
+ * r-cran-bradleyterry2 - GNU R package for using Bradley-Terry models
+ * r-cran-brew - GNU R templating framework for report generation
+ * r-cran-brglm - GNU R package for bias reduction in binomial-response GLMs
+ * r-cran-cairodevice - GNU R Cairo/Gtk2 device driver package
+ * r-cran-car - GNU R Companion to Applied Regression by John Fox
+ * r-cran-caret - GNU R package for classification and regression training
+ * r-cran-catools - GNU R package providing various utility functions
+ * r-cran-checkmate - GNU R fast and versatile argument checks
+ * r-cran-chron - GNU R package for chronologically ordered objects
+ * r-cran-class - GNU R package for classification
+ * r-cran-cluster - GNU R package for cluster analysis by Rousseeuw et al
+ * r-cran-cmprsk - GNU R subdistribution analysis of competing risks
+ * r-cran-coda - Output analysis and diagnostics for MCMC simulations in R
+ * r-cran-codetools - GNU R package providing code analysis tools
+ * r-cran-coin - GNU R package providing conditional inference procedures
+ * r-cran-colorspace - GNU R Color Space Manipulation
+ * r-cran-combinat - GNU R package with utilities for combinatorics
+ * r-cran-contfrac - GNU R package providing various utilities for evaluating continued fractions
+ * r-cran-conting - GNU R package for Bayesian analysis of contingency tables
+ * r-cran-crayon - GNU R colored terminal output
+ * r-cran-cubature - GNU R package for adaptive multivariate integration
+ * r-cran-date - GNU R package for date handling
+ * r-cran-dbi - GNU R package providing a generic database interface
+ * r-cran-deal - Learning Bayesian Networks with Mixed Variables
+ * r-cran-deldir - GNU R Delaunay Triangulation and Dirichlet (Voronoi) Tessellation
+ * r-cran-deoptimr - GNU R package for Differential Evolution in pure R
+ * r-cran-desolve - GNU R package providing functions that solve initial value problems
+ * r-cran-diagnosismed - medical diagnostic test accuracy analysis toolkit
+ * r-cran-dichromat - Color schemes for dichromats
+ * r-cran-digest - GNU R package for 'hash digest' of R data structures
+ * r-cran-domc - GNU R parallel execution backend for %dopar% using multicore
+ * r-cran-doparallel - GNU R foreach parallel adaptor for the parallel package
+ * r-cran-dosefinding - Planning and Analyzing Dose Finding experiments
+ * r-cran-dosnow - GNU R parallel execution backend for %dopar% using snow
+ * r-cran-downloader - GNU R package for downloading files over http and https
+ * r-cran-e1071 - GNU R package with miscellaneous functions of the Dept of Statistics (e1071)
+ * r-cran-eco - GNU R routines for Bayesian ecological inference
+ * r-cran-ecodist - GNU R package for dissimilarity-based ecological analysis
+ * r-cran-effects - GNU R graphical and tabular effects display for glm models
+ * r-cran-elliptic - GNU R package providing elliptic and related functions
+ * r-cran-energy - GNU R package for energy statistics for distribution comparison
+ * r-cran-epi - GNU R epidemiological analysis
+ * r-cran-epibasix - GNU R Elementary Epidemiological Functions
+ * r-cran-epicalc - GNU R Epidemiological calculator
+ * r-cran-epir - GNU R Functions for analysing epidemiological data
+ * r-cran-epitools - GNU R Epidemiology Tools for Data and Graphics
+ * r-cran-erm - GNU R package for 'extended Rasch modelling'
+ * r-cran-estimability - GNU R package providing tools for determining estimability of linear functions
+ * r-cran-etm - GNU R empirical transition matrix
+ * r-cran-evaluate - GNU R parsing and evaluation tools
+ * r-cran-evd - GNU R Functions for extreme value distributions
+ * r-cran-expm - GNU R Computation of the matrix exponential and related quantities
+ * r-cran-fail - GNU R File Abstraction Interface Layer (FAIL) mimicking a key-value store
+ * r-cran-fasianoptions - GNU R package for financial engineering -- fAsianOptions
+ * r-cran-fassets - GNU R package for financial engineering -- fAssets
+ * r-cran-fastcluster - Fast hierarchical clustering routines for GNU R
+ * r-cran-fastmatch - GNU R package for fast match replacement for repeated look-ups
+ * r-cran-fbasics - GNU R package for financial engineering -- fBasics
+ * r-cran-fbonds - GNU R package for financial engineering -- fBonds
+ * r-cran-fcopulae - GNU R package for financial engineering -- fCopulae
+ * r-cran-fexoticoptions - GNU R package for financial engineering -- fExoticOptions
+ * r-cran-fextremes - GNU R package for financial engineering -- fExtremes
+ * r-cran-fgarch - GNU R package for financial engineering -- fGarch
+ * r-cran-fimport - GNU R package for financial engineering -- fImport
+ * r-cran-fmultivar - GNU R package for financial engineering -- fMultivar
+ * r-cran-fnonlinear - GNU R package for financial engineering -- fNonlinear
+ * r-cran-foptions - GNU R package for financial engineering -- fOptions
+ * r-cran-foreach - GNU R foreach looping support
+ * r-cran-foreign - GNU R package to read/write data from other stat. systems
+ * r-cran-formatr - Format R code automatically
+ * r-cran-formula - GNU R package for extended model formulas
+ * r-cran-fportfolio - GNU R package for financial engineering -- fPortfolio
+ * r-cran-fregression - GNU R package for financial engineering -- fRegression
+ * r-cran-ftrading - GNU R package for financial engineering -- fTrading
+ * r-cran-funitroots - GNU R package for financial engineering -- fUnitRoots
+ * r-cran-futile.logger - logging utility for GNU R
+ * r-cran-futile.options - GNU R futile options management
+ * r-cran-g.data - GNU R package for delayed-data
+ * r-cran-gam - Generalized Additive Models for R
+ * r-cran-gbm - GNU R package "Generalized Boosted Regression Models"
+ * r-cran-gdata - GNU R package with data manipulation tools by Greg Warnes et al
+ * r-cran-geepack - Generalized Estimating Equation Package for R
+ * r-cran-genabel - GNU R package for genome-wide SNP association analysis
+ * r-cran-genabel.data - data package for genome-wide SNP association analysis
+ * r-cran-genetics - GNU R package for population genetics
+ * r-cran-getopt - GNU R package providing command-line parsing functionality
+ * r-cran-ggplot2 - implementation of the Grammar of Graphics
+ * r-cran-gmaps - GNU R support for producing geographic maps with grid graphics
+ * r-cran-gmodels - GNU R package with tools for model fitting by Greg Warnes et al
+ * r-cran-gnm - GNU R package for generalized nonlinear models
+ * r-cran-gplots - GNU R package with tools for plotting data by Greg Warnes et al
+ * r-cran-gregmisc - GNU R package with miscellaneous functions by Greg Warnes et al
+ * r-cran-gridextra - GNU R package with extensions for the grid package
+ * r-cran-gsl - GNU R wrapper for the GNU Scientific Library
+ * r-cran-gss - GNU R package for multivariate estimation using smoothing splines
+ * r-cran-gtable - Arrange grobs in tables
+ * r-cran-gtools - GNU R package with R programming tools by Greg Warnes et al
+ * r-cran-haplo.stats - GNU R package for haplotype analysis
+ * r-cran-hdf5 - GNU R package interfacing the NCSA HDF5 library
+ * r-cran-hmisc - GNU R miscellaneous functions by Frank Harrell
+ * r-cran-hwriter - HTML Writer - Outputs R objects in HTML format
+ * r-cran-hypergeo - GNU R package providing the Gaussian hypergeometric for complex numbers
+ * r-cran-igraph - GNU R network analysis and visualization
+ * r-cran-inline - GNU R package to inline C, C++, Fortran functions from R
+ * r-cran-int64 - GNU R package for 64 bit integer types
+ * r-cran-iterators - GNU R iterator support for vectors, lists and other containers
+ * r-cran-its - GNU R package for handling irregular time series
+ * r-cran-jsonlite - Robust, High Performance JSON Parser and Generator for R
+ * r-cran-kernlab - GNU R package for kernel-based machine learning lab
+ * r-cran-kernsmooth - GNU R package for kernel smoothing and density estimation
+ * r-cran-labeling - GNU R Axis Labeling optimization
+ * r-cran-lambda.r - GNU R modeling data with functional programming
+ * r-cran-lattice - GNU R package for 'Trellis' graphics
+ * r-cran-latticeextra - GNU R package of additional graphical displays based on lattice
+ * r-cran-learnbayes - GNU R functions for learning bayesian inference
+ * r-cran-lhs - GNU R package "lhs: Latin Hypercube Samples"
+ * r-cran-littler - GNU R scripting and command-line front-end
+ * r-cran-lme4 - GNU R package for linear mixed effects model fitting
+ * r-cran-lmtest - GNU R package for diagnostic checking in linear models
+ * r-cran-logspline - GNU R package providing routines for the logspline density estimation
+ * r-cran-lpsolve - GNU R package providing linear program solvers
+ * r-cran-lsmeans - GNU R package providing least-squares means for linear, generalized linear, and mixed models
+ * r-cran-magrittr - GNU R forward-pipe operator
+ * r-cran-maldiquant - GNU R package for quantitative analysis of mass spectrometry data
+ * r-cran-maldiquantforeign - GNU R package providing import/export routines for MALDIquant
+ * r-cran-mapdata - GNU R support for producing geographic maps (supplemental data)
+ * r-cran-mapproj - GNU R support for cartographic projections of map data
+ * r-cran-maps - GNU R support for producing geographic maps
+ * r-cran-mass - GNU R package of Venables and Ripley's MASS
+ * r-cran-matchit - GNU R package of nonparametric matching methods
+ * r-cran-matrix - GNU R package of classes for dense and sparse matrices
+ * r-cran-matrixmodels - GNU R package for sparse and dense matrix models
+ * r-cran-matrixstats - GNU R methods that apply to rows and columns of a matrix
+ * r-cran-maxlik - GNU R maximum likelihood estimation
+ * r-cran-mcmcpack - R routines for Markov chain Monte Carlo model estimation
+ * r-cran-medadherence - GNU R Medication Adherence: Commonly Used Definitions
+ * r-cran-memoise - Memoise functions
+ * r-cran-mfilter - GNU R package providing miscellaneous time series filters
+ * r-cran-mgcv - GNU R package for multiple parameter smoothing estimation
+ * r-cran-mime - R package which maps filenames to MIME Types
+ * r-cran-minqa - GNU R package for quadratic optimisation without derivatives
+ * r-cran-misc3d - GNU R collection of 3d plot functions and rgl-based isosurfaces
+ * r-cran-misctools - GNU R miscellaneous tools and utilities
+ * r-cran-mixtools - GNU R tools for analyzing finite mixture models
+ * r-cran-mnormt - GNU R package providing multivariate normal and t distribution
+ * r-cran-mnp - GNU R package for fitting multinomial probit (MNP) models
+ * r-cran-modeltools - GNU R package providing a collection of tools to deal with statistical models
+ * r-cran-msm - GNU R Multi-state Markov and hidden Markov models in continuous time
+ * r-cran-multcomp - GNU R package for multiple comparison procedures
+ * r-cran-multicore - GNU R parallel processing on multi-core or multi-cpu machines
+ * r-cran-munsell - Munsell colour system
+ * r-cran-mvnormtest - GNU R package for multivariate normality test
+ * r-cran-mvtnorm - GNU R package to compute multivariate Normal and T distributions
+ * r-cran-ncdf4 - GNU R interface to Unidata netCDF format data files
+ * r-cran-nlme - GNU R package for (non-)linear mixed effects models
+ * r-cran-nloptr - GNU R package for interface to NLopt
+ * r-cran-nnet - GNU R package for feed-forward neural networks
+ * r-cran-nnls - GNU R package for non-negative least squares (the Lawson-Hanson algorithm)
+ * r-cran-numderiv - GNU R package for accurate numerical derivatives
+ * r-cran-nws - GNU R package for distributed programming via NetWorkSpaces
+ * r-cran-optparse - GNU/R Command line option parser
+ * r-cran-pbapply - GNU R package providing progress bars for vectorized R functions
+ * r-cran-pbivnorm - GNU R package for calculating probabilities from a bivariate normal CDF
+ * r-cran-pbkrtest - GNU R package for tests in linear mixed-effect models
+ * r-cran-permute - R functions for generating restricted permutations of data
+ * r-cran-phangorn - GNU R package for phylogenetic analysis
+ * r-cran-pkgkitten - GNU R package to create simple packages
+ * r-cran-plotrix - GNU R package providing various plotting functions
+ * r-cran-plyr - tools for splitting, applying and combining data
+ * r-cran-polspline - GNU R package providing polynomial spline fitting
+ * r-cran-polyclip - GNU R Polygon Clipping
+ * r-cran-polycub - GNU R Cubature over Polygonal Domains
+ * r-cran-praise - GNU R praise users
+ * r-cran-profilemodel - GNU R tools for profiling inference functions
+ * r-cran-proto - Prototype object-based programming
+ * r-cran-pscl - GNU R package for discrete data models
+ * r-cran-psy - GNU R procedures for psychometrics
+ * r-cran-pvclust - Hierarchical Clustering with P-Values via Multiscale Bootstrap
+ * r-cran-pwt - GNU R package for the Penn World Tables (version 5.6 to 7.1)
+ * r-cran-pwt8 - GNU R package for the Penn World Tables (version 8)
+ * r-cran-qtl - GNU R package for genetic marker linkage analysis
+ * r-cran-quadprog - GNU R package for solving quadratic programming problems
+ * r-cran-quantreg - GNU R package for quantile regression
+ * r-cran-qvcalc - GNU R package for computing quasi variances for factor effects
+ * r-cran-r.methodss3 - GNU R utility function for defining S3 methods
+ * r-cran-r.oo - GNU R object-oriented programming with or without references
+ * r-cran-r.utils - GNU R various programming utilities
+ * r-cran-randomfields - GNU R simulation and analysis of random fields
+ * r-cran-randomforest - GNU R package implementing the random forest classificator
+ * r-cran-raschsampler - GNU R package for sampling binary matrices with fixed margins
+ * r-cran-rcmdr - GNU R platform-independent basic-statistics GUI
+ * r-cran-rcmdrmisc - GNU R package for miscellaneous Rcmdr utilities
+ * r-cran-rcolorbrewer - GNU R package providing suitable color palettes
+ * r-cran-rcpp - GNU R package for Seamless R and C++ Integration
+ * r-cran-rcpparmadillo - GNU R package for Armadillo C++ linear algebra library
+ * r-cran-rcppeigen - GNU R package for Eigen templated linear algebra
+ * r-cran-rcurl - GNU R General network (HTTP/FTP/...) client interface
+ * r-cran-readbrukerflexdata - GNU R package to read Bruker Daltonics \*flex format files
+ * r-cran-readmzxmldata - GNU R package to read mass spectrometry data in mzXML format
+ * r-cran-readxl - GNU R package to read Excel files
+ * r-cran-relimp - GNU R package for inference on relative importance of regressors
+ * r-cran-reshape - Flexibly reshape data
+ * r-cran-reshape2 - Flexibly reshape data: a reboot of the reshape package
+ * r-cran-rggobi - GNU R package for the GGobi data visualization system
+ * r-cran-rgl - GNU R package for three-dimensional visualisation using OpenGL
+ * r-cran-rglpk - GNU R interface to the GNU Linear Programming Kit
+ * r-cran-rgtk2 - GNU R binding for Gtk2
+ * r-cran-rinside - GNU R package to embed R in C++ application
+ * r-cran-rjags - R interface to the JAGS Bayesian statistics package
+ * r-cran-rjava - GNU R low-level interface to Java
+ * r-cran-rjson - GNU R package for converting between R and JSON objects
+ * r-cran-rmpi - GNU R package interfacing MPI libraries for distributed computing
+ * r-cran-rms - GNU R regression modeling strategies by Frank Harrell
+ * r-cran-rmysql - GNU R package providing a DBI-compliant interface to MySQL
+ * r-cran-rneos - GNU R package with XML-RPC interface to NEOS
+ * r-cran-rnetcdf - GNU R package that provides an R interface to NetCDF datasets
+ * r-cran-rniftilib - GNU/R interface to NIFTICLIB
+ * r-cran-robustbase - GNU R package providing basic robust statistics
+ * r-cran-rocr - GNU R package to prepare and display ROC curves
+ * r-cran-rodbc - GNU R package for ODBC database access
+ * r-cran-rpart - GNU R package for recursive partitioning and regression trees
+ * r-cran-rpostgresql - GNU R package providing database interface and driver for PostgreSQL
+ * r-cran-rquantlib - GNU R package interfacing the QuantLib finance library
+ * r-cran-rsclient - GNU R package providing an Rserve client
+ * r-cran-rsdmx - GNU R package for the Statistical Data and Metadata Exchange (SDMX) framework
+ * r-cran-rserve - GNU R Rserve tcp/ip server and sample clients
+ * r-cran-rsolnp - GNU R general non-linear optimization
+ * r-cran-rsprng - GNU R interface to SPRNG (Scalable Parallel RNGs)
+ * r-cran-rsqlite - Database Interface R driver for SQLite
+ * r-cran-rsymphony - GNU R interface to the SYMPHONY MILP solver
+ * r-cran-runit - GNU R package providing unit testing framework
+ * r-cran-sandwich - GNU R package for model-robust standard error estimates
+ * r-cran-scales - Scale functions for graphics
+ * r-cran-scatterplot3d - GNU R package for Visualizing Multivariate Data
+ * r-cran-segmented - GNU R segmented relationships in regression models
+ * r-cran-sendmailr - send email using GNU R
+ * r-cran-seqinr - GNU R biological sequences retrieval and analysis
+ * r-cran-seroincidence - GNU R seroincidence calculator tool
+ * r-cran-slam - GNU R sparse lightweight arrays and matrices package
+ * r-cran-sm - GNU R package for kernel smoothing methods
+ * r-cran-sn - GNU R package providing skew-normal and skew-t distributions
+ * r-cran-snow - GNU R package for 'simple network of workstations'
+ * r-cran-sp - GNU R classes and methods for spatial data
+ * r-cran-spam - GNU R functions for sparse matrix algebra
+ * r-cran-sparsem - GNU R package for basic linear algebra for sparse matrices
+ * r-cran-spatial - GNU R package for spatial statistics
+ * r-cran-spatstat - GNU R Spatial Point Pattern analysis, model-fitting, simulation, tests
+ * r-cran-spc - GNU R Statistical Process Control
+ * r-cran-stabledist - GNU R package for stable distribution functions
+ * r-cran-statmod - GNU R package providing algorithms and functions for statistical modeling
+ * r-cran-stringi - GNU R character string processing facilities
+ * r-cran-stringr - Make it easier to work with strings
+ * r-cran-strucchange - GNU R package for structural change regression estimation
+ * r-cran-survival - GNU R package for survival analysis
+ * r-cran-tcltk2 - GNU R package for Tcl/Tk additions
+ * r-cran-teachingdemos - GNU R Demonstrations for teaching and learning
+ * r-cran-tensor - GNU R Tensor product of arrays
+ * r-cran-testthat - GNU R testsuite
+ * r-cran-tgp - GNU R package "tgp: Bayesian treed Gaussian process models"
+ * r-cran-th.data - GNU R package for datasets by Torsten Hothorn
+ * r-cran-timedate - GNU R package for financial engineering -- timeDate
+ * r-cran-timeseries - GNU R package for financial engineering -- timeSeries
+ * r-cran-tkrplot - GNU R embedded Tk plotting device package
+ * r-cran-truncnorm - GNU R truncated normal distribution
+ * r-cran-tseries - GNU R package for time-series analysis and comp. finance
+ * r-cran-urca - GNU R package providing unit root and cointegration tests
+ * r-cran-vcd - GNU R Visualizing Categorical Data
+ * r-cran-vcdextra - GNU R package providing extensions and additions to the vcd package
+ * r-cran-vegan - Community Ecology Package for R
+ * r-cran-vgam - GNU R package for estimating vector generalized additive models
+ * r-cran-vioplot - GNU R toolbox for violin plots
+ * r-cran-wdi - GNU R package for accessing the World Development Indicators
+ * r-cran-xml - GNU R package for XML parsing and generation
+ * r-cran-xtable - GNU R coerce data to LaTeX and HTML tables
+ * r-cran-xts - GNU R package for time series analysis -- xts
+ * r-cran-yaml - Methods to convert R data to YAML and back
+ * r-cran-zelig - GNU R package providing a unified front-end for estimating statistical models
+ * r-cran-zoo - GNU R package for totally ordered indexed observations
+ * r-doc-html - GNU R html manuals for statistical computing system
+ * r-doc-info - GNU R info manuals statistical computing system
+ * r-doc-pdf - GNU R pdf manuals for statistical computing system
+ * r-mathlib - GNU R standalone mathematics library
+ * r-recommended - GNU R collection of recommended packages [metapackage]
+ * r-cran-maptools - GNU R Tools for reading and handling spatial objects
+ * r-cran-surveillance - GNU R package for the Modeling and Monitoring of Epidemic Phenomena
+
+
diff --git a/userguide/zh/portal/README.md b/userguide/zh/portal/README.md
new file mode 100644
index 0000000..5a8940d
--- /dev/null
+++ b/userguide/zh/portal/README.md
@@ -0,0 +1,10 @@
+## Page Overview ##
+
+The main Docklet pages:
+
+* [Dashboard](dashboard.md)
+* [Config](config.md)
+* [Status](status.md)
+* [Hosts](hosts.md)
+* [Users](users.md)
+* [Admin](admin.md)
diff --git a/userguide/zh/portal/admin.md b/userguide/zh/portal/admin.md
new file mode 100644
index 0000000..3adb606
--- /dev/null
+++ b/userguide/zh/portal/admin.md
@@ -0,0 +1,4 @@
+## Admin ##
+
+The Admin page can be accessed only by users with the *admin* role.
+
diff --git a/userguide/zh/portal/config.md b/userguide/zh/portal/config.md
new file mode 100644
index 0000000..765c402
--- /dev/null
+++ b/userguide/zh/portal/config.md
@@ -0,0 +1,20 @@
+## Config ##
+
+The Config page is where you configure a workspace,
+mainly by configuring the nodes of its virtual cluster and managing your own images.
+
+The figure below shows the Config page of the *tee* workspace:
+
+workspace
+config + +在这里,可以为工作区增加或删除节点。上图显示 *tee* 工作区现在有两个节点, +并显示每个节点的名称、IP地址、状态、镜像信息等。 + +可以点击 **Save** 保存工作区,将其做成一个镜像,名为 *myimage*,方便下次使用。 +此时的镜像列表信息如下图所示: + +image config + +可以对保存的镜像进行管理,将其**共享(Share)**出去,造福他人, +免去他们繁琐的配置步骤。 diff --git a/userguide/zh/portal/dashboard.md b/userguide/zh/portal/dashboard.md new file mode 100644 index 0000000..3986afc --- /dev/null +++ b/userguide/zh/portal/dashboard.md @@ -0,0 +1,49 @@ +## 面板 ## + +面板(Dashboard)是工作区(Workspace)的操作界面, +在这里用户可以看到其拥有的所有workspace的状态, +进行工作区的创建,启动,停止,删除操作,并可以进入工作区。 + +初始时工作区列表为空,用户需要创建一个工作区。 +点击 **Add Workspace**,进入创建工作区的页面,选择镜像,创建工作区 +**tee**,如下图所示: + +add 
+workspace + +创建成功后,工作区列表如下图所示: + +dashboard
+stopped status + +图中显示已经创建了一个名为 *tee* 的工作区,其状态为 **Stopped**。 +此时可以进行 **启动(Start)** 和 **删除(Delete)** 工作区的操作。 +被停止的 工作区 不会被销毁,下次仍然可以继续使用它。 +推荐在不运行作业的时候停止 工作区 以减少服务器负载。 + +点击 **Start**,启动工作区,工作区状态改变,如下图所示: + +dashboard
+running status + +此时显示其状态为 **Running**,可进行 **停止(Stop)** 和 **进入(Go)** +工作区的操作。也可以点击工作区的名字 *tee*, +可以进入工作区的 [配置](config.md) 页面;点击状态**Runnig**, +进入工作区的 [状态](status.md) 页面。 + +**注意**: + +1. 如果在刚创建之后马上点击 **Go** ,可能会出现`503: Proxy Target +Missing` 错误,这很可能是因为后台尚未初始化完成。 +稍后再点击 **Go**,就会正常进入工作区。 + +1. 如果用户点击 **Go** 时出现 + +``` +Not Found + +The requested URL was not found on the server. If you entered the URL +manually please check your spelling and try again. +``` + +可能是因为系统因为某种原因重启,用户需要自己重新启动Workspace,然后再进入。 diff --git a/userguide/zh/portal/hosts.md b/userguide/zh/portal/hosts.md new file mode 100644 index 0000000..e920738 --- /dev/null +++ b/userguide/zh/portal/hosts.md @@ -0,0 +1,22 @@ +## 主机 ## + +主机(Hosts)页面仅由具有 *管理员(admin)* 角色的用户访问。 + +主机页面显示了Docklet物理集群中各主机的状态和配置信息, +包括IP地址、状态、运行的容器节点、所消耗的资源等。 + +下图展示了一个测试环境中的Docklet物理集群信息: + +hosts + +点击节点的 **Realtime** 按钮,可以看到该主机的配置信息: + +
+summary config of one host + +也可以看到该主机的实时资源消耗: + +
+realtime performance of one host + + diff --git a/userguide/zh/portal/status.md b/userguide/zh/portal/status.md new file mode 100644 index 0000000..73d6eb3 --- /dev/null +++ b/userguide/zh/portal/status.md @@ -0,0 +1,12 @@ +### 状态 ### + +状态(Status) 是虚拟集群的监控界面。在这里,可以看到工作区所在集群的负载情况, +包括CPU和Memory的使用情况。 + +下图展示了工作区 *tee* 所在虚拟集群的状态概览: + +vcluster status + +点击 **Realtime** ,可看到节点的CPU和Memory的实时使用情况,如下图所示: + +vcluster status summary of one node diff --git a/userguide/zh/portal/users.md b/userguide/zh/portal/users.md new file mode 100644 index 0000000..dc64be9 --- /dev/null +++ b/userguide/zh/portal/users.md @@ -0,0 +1,22 @@ +## 用户 ## + +用户(Users)页面仅由具有 **管理员(admin)** 权限的用户访问。 +该页面提供对docklet中用户信息展示、修改功能。 +同时可以添加用户、用户组。 + +### 注册与激活 ### +目前docklet不支持自由注册,可以使用外部的认证系统登录(如pam)。 + +第一次使用外部账号认证登录时,账号处于未激活状态,登录之后页面左上角会显示一条待激活的通知,未激活的账号不能使用一些特定功能。 + +add workspace + +点击该通知之后会进入激活页面,需要在该页面中填写E-mail、学号/职工号、部门、真实姓名、注册原因,提交申请之后,管理员会在3天之内审核该申请。 + +提交申请表单之后浏览器会自动登出,需要重新登录刷新用户信息。在审核期间docklet页面的左上角会显示等待审核的通知,点击即可关闭。 + +## 试用 ## +docklet支持用户以访客模式试用,在你的帐号审核通过之前,你可以通过点击主页的have a try按钮进入 + + +试用界面,在试用模式下,你可以点击Go按钮进入到一个workspace中。 diff --git a/userguide/zh/software/README.md b/userguide/zh/software/README.md new file mode 100644 index 0000000..727ee3b --- /dev/null +++ b/userguide/zh/software/README.md @@ -0,0 +1,8 @@ +# 在终端中开发 # + + * [安装的软件包列表](list.md) + * [C/C++示例](c.md) + * [Java示例](java.md) + * [MPI示例](mpi.md) + * [Spark示例](spark.md) + * [LaTeX示例](latex.md) diff --git a/userguide/zh/software/c.md b/userguide/zh/software/c.md new file mode 100644 index 0000000..3a372e3 --- /dev/null +++ b/userguide/zh/software/c.md @@ -0,0 +1,167 @@ +## C/C++ 开发示例 ## + +Docklet提供了GCC 5+ 和 GDB,用于帮助开发 C 程序。 + +### C + +文件 `sum.c` 如下: + +```c +#include +#define TOP 50 + +int main() +{ + int i = 0 , sum = 0 ; + while ( i <= TOP ) { + sum += i ; + i++ ; + } + printf(" Sum of 1 to %d is %d\n", TOP, sum ) ; + return 0 ; +} +``` + +进入WEB Terminal,编译并运行 + +``` +$ gcc -g sum.c -o sum +$ ./sum + Sum of 1 to 50 is 
1275 +``` + +### C++ + +下面显示了一个有错的C++程序`main.cpp`,原文来源于[这里](http://www.cprogramming.com/gdb.html) + +```c++ +#include + +using namespace std; + +long factorial(int n); + +int main() +{ + int n(0); + cin>>n; + long val=factorial(n); + cout<... +8. Reading symbols from /root/main...done. +9. (gdb) break 11 +10. Breakpoint 1 at 0x80485f9: file main.cpp, line 11. +11. (gdb) run +12. Starting program: /root/main +13. 3 +14. +15. Breakpoint 1, main () at main.cpp:11 +16. 11 long val=factorial(n); +17. (gdb) step +18. factorial (n=3) at main.cpp:19 +19. 19 long result(1); +20. (gdb) list +21. 14 return 0; +22. 15 } +23. 16 +24. 17 long factorial(int n) +25. 18 { +26. 19 long result(1); +27. 20 while(n--) +28. 21 { +29. 22 result*=n; +30. 23 } +31. (gdb) watch n +32. Hardware watchpoint 2: n +33. (gdb) watch result +34. Hardware watchpoint 3: result +35. (gdb) continue +36. Continuing. +37. Hardware watchpoint 3: result +38. +39. Old value = 0 +40. New value = 1 +``` + +`result` 变量初始化为1。 + +``` +41. factorial (n=3) at main.cpp:20 +42. 20 while(n--) +43. (gdb) +``` + +注意刚才没有输入任何命令,只是输入**回车**,它会重复执行上一条命令。 + +``` +44. Continuing. +45. Hardware watchpoint 2: n +46. +47. Old value = 3 +48. New value = 2 +``` + +注意 `n` 从 3 变成了 2. + +``` +49. 0x08048654 in factorial (n=2) at main.cpp:20 +50. 20 while(n--) +51. (gdb) +52. Continuing. +53. Hardware watchpoint 3: result +54. +55. Old value = 1 +56. 
+New value = 2
+```
+
+`result` is now 2. We have found a bug: the final value of `result` should be `3*2*1`,
+but the multiplication starts from 2, because `while (n--)` decrements `n` before the loop body uses it.
+
+Change the code as follows:
+
+```c++
+while(n>0) //doesn't let n reach 0
+{
+    result*=n;
+    n--; //decrements only after the evaluation
+}
+```
+
+With this change the program runs correctly.
+
+The [original article](http://www.cprogramming.com/gdb.html) walks through the debugging session in more detail.
+
+For the official GDB documentation, see the
+[GDB manual](http://sourceware.org/gdb/current/onlinedocs/gdb/).
diff --git a/userguide/zh/software/java.md b/userguide/zh/software/java.md
new file mode 100644
index 0000000..206a3d4
--- /dev/null
+++ b/userguide/zh/software/java.md
@@ -0,0 +1,34 @@
+## Java
+
+Docklet provides OpenJDK 7+ for developing Java programs.
+
+A Java program that computes a factorial, NumberFactorial.java, is shown below:
+
+```java
+public class NumberFactorial {
+    public static void main(String[] args) {
+        int number = 5;
+        /*
+         * The factorial of a number n is written n!.
+         * For example, factorial of 4 is 4*3*2*1.
+         */
+
+        int factorial = number;
+
+        for(int i =(number - 1); i > 1; i--) {
+            factorial = factorial * i;
+        }
+
+        System.out.println("Factorial of a number is " + factorial);
+    }
+}
+```
+
+Open the WEB Terminal, then compile and run:
+
+```
+$ javac NumberFactorial.java
+$ java NumberFactorial
+Factorial of a number is 120
+```
+
diff --git a/userguide/zh/software/latex.md b/userguide/zh/software/latex.md
new file mode 100644
index 0000000..a96d40d
--- /dev/null
+++ b/userguide/zh/software/latex.md
@@ -0,0 +1,475 @@
+## LaTeX example ##
+
+Docklet installs every package of [TeX Live 2015](http://www.tug.org/texlive), so you can
+write and compile LaTeX directly in the WEB Terminal, generate a PDF file, and open it in the Jupyter
+Dashboard to view the result.
+
+### Hello World
+
+Source of hello.tex:
+
+```latex
+\documentclass{article}
+\title{My \LaTeX ~Hello World}
+\author{Zhang San}
+\date{Feb 20, 2016}
+\begin{document}
+\maketitle
+Hello World!
+\end{document}
+```
+
+Compile with `pdflatex` or `xelatex`:
+
+```
+$ xelatex hello
+$ ls
+hello.aux hello.log hello.pdf hello.tex
+```
+
+### Chinese support
+
+In TeX Live 2015, `xeCJK` + `xelatex` provide very good support for Chinese, Japanese, and Korean fonts.
+Loading the `xeCJK` package is enough to use the CJK vector fonts installed on the system; no font generation is needed.
+
+
+Make hello.tex contain Chinese:
+
+```latex
+\documentclass{article}
+
+\usepackage{xeCJK} % CJK font support
+
+\title{我的 \LaTeX ~世界你好}
+\author{张三}
+\date{2016年2月20日}
+\begin{document}
+\maketitle
+世界,你好!
+\end{document}
+```
+
+Compile with `xelatex`:
+
+```
+$ xelatex hello
+$ ls
+hello.aux hello.log hello.pdf hello.tex
+```
+
+If the automatically selected fonts are not satisfactory, you can specify fonts yourself.
+
+First list the Chinese fonts on the system:
+
+```
+$ fc-list :lang=zh
+/usr/share/fonts/truetype/arphic/uming.ttc: AR PL UMing TW MBE:style=Light
+/usr/share/fonts/truetype/arphic/ukai.ttc: AR PL UKai CN:style=Book
+/usr/share/fonts/truetype/arphic/ukai.ttc: AR PL UKai HK:style=Book
+/usr/share/fonts/truetype/arphic/ukai.ttc: AR PL UKai TW:style=Book
+/usr/share/fonts/truetype/wqy/wqy-microhei.ttc: WenQuanYi Micro Hei,文泉驛微米黑,文泉驿微米黑:style=Regular
+/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc: WenQuanYi Zen Hei,文泉驛正黑,文泉驿正黑:style=Regular
+/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc: WenQuanYi Zen Hei Sharp,文泉驛點陣正黑,文泉驿点阵正黑:style=Regular
+/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf: Droid Sans Fallback:style=Regular
+/usr/share/fonts/truetype/arphic-gkai00mp/gkai00mp.ttf: AR PL KaitiM GB,文鼎PL简中楷:style=Regular
+/usr/share/fonts/truetype/arphic-gbsn00lp/gbsn00lp.ttf: AR PL SungtiL GB,文鼎PL简报宋:style=Regular
+/usr/share/fonts/truetype/arphic/ukai.ttc: AR PL UKai TW MBE:style=Book
+/usr/share/fonts/truetype/arphic/uming.ttc: AR PL UMing TW:style=Light
+/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc: WenQuanYi Zen Hei Mono,文泉驛等寬正黑,文泉驿等宽正黑:style=Regular
+/usr/share/fonts/truetype/arphic/uming.ttc: AR PL UMing CN:style=Light
+/usr/share/fonts/truetype/arphic/uming.ttc: AR PL UMing HK:style=Light
+/usr/share/fonts/truetype/wqy/wqy-microhei.ttc: WenQuanYi Micro Hei Mono,文泉驛等寬微米黑,文泉驿等宽微米黑:style=Regular
+```
+
+Specify the CJK fonts:
+
+```latex
+\documentclass{article}
+
+\usepackage{indentfirst} % indent the first line of each paragraph
+\setlength{\parindent}{2em} % indent by two characters
+
+\usepackage{xeCJK} % CJK font support
+\setCJKmainfont[ItalicFont={AR PL KaitiM GB}]{AR PL KaitiM GB} % default CJK font
+\setCJKsansfont{WenQuanYi Zen Hei} % sans-serif font, optional
+\setCJKmonofont{WenQuanYi Zen Hei Mono} % monospaced font, optional
+
+\title{我的 \LaTeX ~程序}
+\author{张三}
+\date{2016年2月20日}
+\begin{document}
+\maketitle
+
+\section{序言}
+
+世界,你好!
+
+\textbf{世界},\textit{你好!}
+
+\textsf{无衬线字体},\texttt{等宽字体!}
+
+\end{document}
+
+```
+
+
+### A complete Chinese LaTeX example
+
+Below is an example by Yu Jiangsheng of Peking University; the [original](https://wiki.freebsdchina.org/doc/x/xelatex)
+has been slightly modified here to make it easier to compile and test:
+
+```latex
+\documentclass[12pt,a4paper]{article}
+\setlength{\parindent}{2em} % 首行空两字
+\usepackage{fontspec} % 设置字体
+%\setmainfont{宋体}
+\usepackage{indentfirst} % 首行缩进
+\usepackage{xeCJK} % 中英文混排
+
+\usepackage{tikz}
+
+%%%%%%%%%% 数学符号公式 %%%%%%%%%%
+\usepackage{amsmath} % AMS LaTeX宏包
+%\usepackage{amssymb} % 用来排版漂亮的数学公式
+%\usepackage{amsbsy}
+\usepackage{amsthm}
+\usepackage{amsfonts}
+\usepackage{mathrsfs} % 英文花体字体
+\usepackage{bm} % 数学公式中的黑斜体
+\usepackage{bbding,manfnt} % 一些图标,如 \dbend
+\usepackage{lettrine} % 首字下沉,命令\lettrine
+\def\attention{\lettrine[lines=2,lraise=0,nindent=0em]{\large\textdbend\hspace{1mm}}{}}
+%\usepackage{relsize} % 调整公式字体大小:\mathsmaller,\mathlarger
+%\usepackage{caption2} % 浮动图形和表格标题样式
+
+%%%%%%%%%% 图形支持宏包 %%%%%%%%%%
+\usepackage{graphicx} % 嵌入png图像
+\usepackage{color,xcolor} % 支持彩色文本、底色、文本框等
+%\usepackage{subfigure}
+%\usepackage{epsfig} % 支持eps图像
+%\usepackage{picinpar} % 图表和文字混排宏包
+%\usepackage[verbose]{wrapfig} % 图表和文字混排宏包
+%\usepackage{eso-pic} % 向文档的部分页加n副图形, 可实现水印效果
+%\usepackage{eepic} % 扩展的绘图支持
+%\usepackage{curves} % 绘制复杂曲线
+%\usepackage{texdraw} % 增强的绘图工具
+%\usepackage{treedoc} % 树形图绘制
+%\usepackage{pictex} % 可以画任意的图形
+%\usepackage{hyperref}
+
+%%%%%%%%%% 粘贴源代码 %%%%%%%%%%
+\usepackage{listings} % 粘贴源代码
+\lstloadlanguages{R, C, csh, make} % 所要粘贴代码的编程语言
+\lstdefinelanguage{Renhanced}[]{R}{% +
morekeywords={acf,ar,arima,arima.sim,colMeans,colSums,is.na,is.null,% + mapply,ms,na.rm,nlmin,replicate,row.names,rowMeans,rowSums,seasonal,% + sys.time,system.time,ts.plot,which.max,which.min}, + deletekeywords={c}, + alsoletter={.\%},% + alsoother={:_\$}} +\newcommand{\indexfonction}[1]{\index{#1@\texttt{#1}}} +\lstset{language=Renhanced,tabsize=4, keepspaces=true, + xleftmargin=2em,xrightmargin=0em, aboveskip=1em, + backgroundcolor=\color{gray!20}, % 定义背景颜色 + frame=none, % 表示不要边框 + extendedchars=false, % 解决代码跨页时,章节标题,页眉等汉字不显示的问题 + basicstyle=\small, + keywordstyle=\color{black}\bfseries, + breakindent=10pt, + identifierstyle=, % nothing happens + commentstyle=\color{blue}\small, % 注释的设置 + morecomment=[l][\color{blue}]{\#}, + numbers=left,stepnumber=1,numberstyle=\scriptsize, + showstringspaces=false, + showspaces=false, + flexiblecolumns=true, + breaklines=true, breakautoindent=true,breakindent=4em, + escapeinside={/*@}{@*/}, +} + +%%%%%%%%%% 正文 %%%%%%%%%% +\begin{document} +%%%%%%%%%% 定理类环境的定义 %%%%%%%%%% +%% 必须在导入中文环境之后 +\newtheorem{example}{例} % 整体编号 +\newtheorem{algorithm}{算法} +\newtheorem{theorem}{定理}[section] % 按 section 编号 +\newtheorem{definition}{定义} +\newtheorem{axiom}{公理} +\newtheorem{property}{性质} +\newtheorem{proposition}{命题} +\newtheorem{lemma}{引理} +\newtheorem{corollary}{推论} +\newtheorem{remark}{注解} +\newtheorem{condition}{条件} +\newtheorem{conclusion}{结论} +\newtheorem{assumption}{假设} + +%%%%%%%%%% 一些重定义 %%%%%%%%%% +\renewcommand{\contentsname}{目录} % 将Contents改为目录 +\renewcommand{\abstractname}{摘要} % 将Abstract改为摘要 +\renewcommand{\refname}{参考文献} % 将References改为参考文献 +\renewcommand{\indexname}{索引} +\renewcommand{\figurename}{图} +\renewcommand{\tablename}{表} +\renewcommand{\appendixname}{附录} +\renewcommand{\proofname}{证明} +\renewcommand{\algorithm}{算法} + +%%%%%%%%%% 论文标题、作者等 %%%%%%%%%% +\title{用\LaTeX 写科技论文\thanks{这是一个为初学者写的\TeX 论文模板, +未经作者允许可以随意下载使用并修改传播,目的是让更多的人迅速上手用\TeX 系统写作。} + } +\author{于江生,北京大学计算机系} +%\date{2008年10月01日} % 日期 +\date{} +\maketitle % 
生成标题 +\tableofcontents % 插入目录 +\thispagestyle{empty} % 首页无页眉页脚 + +\begin{abstract} +\noindent 这是一个简单的中文\TeX 模板,为\TeX 的初学者提供便利上手的参照。 +该模板在 \TeX Live 下通过xelatex命令生成PDF文件,适合在类UNIX操作系统下工作的朋友从一个简单的模板出发, +不断地提升对\TeX 的认识。注意:若想用 xelatex命令,\TeX 文件必须按照 UTF-8 编码保存。 +因为 Xe\TeX 是一种使用 Unicode 编码的 \TeX 系统,它对中文的支持是发自肺腹的,免去了繁复的配置。 +\end{abstract} + +%\PencilRightUp % 一些可爱的图标,需要bbding宏包的支持 +公元1974年,ACM图灵奖授予了Standford大学教授\index{Donald E. Knuth} Donald E. Knuth (高德纳), +表彰他在算法和程序语言设计等多方面杰出的成就。他的巨著 The Art of Computer Programming 令人震撼。 +另外,Knuth的突出贡献还包括\index{\TeX 系统} \TeX 系统,毫不夸张地评价,\TeX 给排版带来了一场革命。 +%%%%%%%%%% section %%%%%%%%%% +\section{编辑数学公式} +\indent % 恢复缩进 +Knuth 用\$ 符号界定数学公式,暗指着每个好的公式都是无价之宝。有了\TeX 系统,输入数学公式变得简单愉快。如, + +\begin{theorem}[L\'{e}vy\index{L\'{e}vy 定理}] +令 $F(x),\varphi(t)$ 分别为随机变量 $X$ 的分布函数和特征函数。 +假定 $F(x)$ 在 $a+h$ 和 $a-h (h>0)$ 处连续,则有 +\begin{align} + \label{Levy theorem} % 方程的标记可以是专有名词 +F(a+h)-F(a-h)&=\lim_{T\rightarrow\infty}\frac{1}{\pi}\int^{T}_{-T}\frac{\sin ht}{t} +e^{-ita}\varphi(t)dt +\end{align} +\end{theorem} +\begin{proof} + 从略。感兴趣的读者可以参考……。 +\end{proof} + + +\begin{corollary} +密度函数和特征函数之间有如下的关系。 +\begin{align} + \label{DensityCharacteristic} % 自定义的标记 + f(x)&=\frac{1}{2\pi}\int^{+\infty}_{-\infty} e^{-itx}\varphi(t)dt +\end{align} +\end{corollary} + +\begin{proof} +由公式 (\ref{Levy theorem}) 和 Lebesgue 定理,我们有 +\begin{align*} + \frac{F(x+\Delta x)-F(x)}{\Delta x}&=\frac{1}{2\pi}\int^{+\infty}_{-\infty} + \frac{\sin(t\Delta x/2)}{t\Delta x/2} e^{-it(x+\Delta x/2)}\varphi(t) dt\\ + f(x)&=\frac{1}{2\pi}\int^{+\infty}_{-\infty}\lim_{\Delta x\rightarrow 0} + \frac{\sin(t\Delta x/2)}{t\Delta x/2} e^{-it(x+\Delta x/2)}\varphi(t) dt\\ + &=\frac{1}{2\pi}\int^{+\infty}_{-\infty} e^{-itx}\varphi(t)dt\qedhere +\end{align*} +\end{proof} + +我们知道特征函数的定义是 + +\begin{align} + \label{section1:characteristic} % 标记中注明了章节号 + \varphi(t)&= E(e^{itX})=\int^{+\infty}_{-\infty} e^{itx} f(x)dx +\end{align} + +L\'{e}vy 定理在分布函数和特征函数之间搭建了一座桥梁。 +对比 (\ref{DensityCharacteristic}) 和 
(\ref{section1:characteristic}) 可见, +密度函数和特征函数之间的关系非常巧妙。 + + +\attention 在\TeX 环境里,数学公式的表达是很自然的,绝大多数命令就是英文的数学专有名词或它们的缩写, +如果你以前读过英文的数学文献,记忆这些命令是不难的。如果你没读过,正好通过记忆这些命令来了解术语。 + + +手头有个命令快速寻查表是很方便的,我用的是 Hypertext Help with \LaTeX,网上可以搜到,是免费的。 + +%%%%%%%%%%% section %%%%%%%%%% +\section{图形表格等浮动对象} + +\index{贝叶斯方法}贝叶斯方法\cite{Gelman} 主要用于小样本数据分析,它利用参数先验分布和 +后验分布之差异进行统计推断,其一般步骤是: + +\begin{enumerate} + \item 构建概率模型,包括参数的先验分布。 + \item 给定观察数据,计算参数的后验分布。 + \item 分析模型的效果,如有必要,回到第一步。 +\end{enumerate} + +\begin{example} +下面,我们给一个表格的例子,一个图形的例子。 + +\begin{center} +\begin{table}[!ht] % 强制在原位显示表格 +\centering +\caption{二维随机向量$(X,Y)$的边缘分布} +\begin{tabular}{l|ccccc|c} + $_X$\hspace{3mm} $^Y$&$y_1$&$y_2$&$\cdots$&$y_j$&$\cdots$\\ +\hline +$x_1$ &$p_{11}$&$p_{12}$&$\cdots$&$p_{1j}$&$\cdots$&$p_{1\cdot}$\\ +$x_2$ &$p_{21}$&$p_{22}$&$\cdots$&$p_{2j}$&$\cdots$&$p_{2\cdot}$\\ +$\vdots$&$\vdots$&$\vdots$&$\vdots$&$\vdots$&$\vdots$&$\vdots$\\ +$x_i$ &$p_{i1}$&$p_{i2}$&$\cdots$&$p_{ij}$&$\cdots$&$p_{i\cdot}$\\ +$\vdots$&$\vdots$&$\vdots$&$\vdots$&$\vdots$&$\vdots$&$\vdots$\\ +\hline + &$p_{\cdot 1}$&$p_{\cdot 2}$&$\cdots$&$p_{\cdot j}$&$\cdots$&1 +\label{marginal distribution} +\end{tabular} +\end{table} +\end{center} + +在表\ref{marginal distribution} 中,$p_{\cdot j}=\sum\limits_i p_{ij}$, +类似地,$ p_{i\cdot}=\sum\limits_j p_{ij}$。 +\end{example} + +% 插入一个图片 +%\begin{center} +%\begin{figure}[!h] +%\centering +%\includegraphics[width=0.95\textwidth]{knot.png} +%\caption{吞尾的环面和纽结,由 Maxima 绘制。\hfill\mbox{}} +%\label{torus and knot} +%\end{figure} +%\end{center} + +%原文用的是knot.png文件,这里采用了一个在latex中直接画图的例子 +图 \ref{tikz graph} 是来自 http://www.texample.net/ 的一个tikz示例。 + +\begin{center} +\begin{figure}[!hbt] +\centering +\scalebox{0.8}{ +\begin{tikzpicture}[level/.style={sibling distance=60mm/#1}] +\node [circle,draw] (z){$n$} + child {node [circle,draw] (a) {$\frac{n}{2}$} + child {node [circle,draw] (b) {$\frac{n}{2^2}$} + child {node {$\vdots$} + child {node [circle,draw] (d) {$\frac{n}{2^k}$}} + child 
{node [circle,draw] (e) {$\frac{n}{2^k}$}} + } + child {node {$\vdots$}} + } + child {node [circle,draw] (g) {$\frac{n}{2^2}$} + child {node {$\vdots$}} + child {node {$\vdots$}} + } + } + child {node [circle,draw] (j) {$\frac{n}{2}$} + child {node [circle,draw] (k) {$\frac{n}{2^2}$} + child {node {$\vdots$}} + child {node {$\vdots$}} + } + child {node [circle,draw] (l) {$\frac{n}{2^2}$} + child {node {$\vdots$}} + child {node (c){$\vdots$} + child {node [circle,draw] (o) {$\frac{n}{2^k}$}} + child {node [circle,draw] (p) {$\frac{n}{2^k}$} + child [grow=right] {node (q) {$=$} edge from parent[draw=none] + child [grow=right] {node (q) {$O_{k = \lg n}(n)$} edge from parent[draw=none] + child [grow=up] {node (r) {$\vdots$} edge from parent[draw=none] + child [grow=up] {node (s) {$O_2(n)$} edge from parent[draw=none] + child [grow=up] {node (t) {$O_1(n)$} edge from parent[draw=none] + child [grow=up] {node (u) {$O_0(n)$} edge from parent[draw=none]} + } + } + } + child [grow=down] {node (v) {$O(n \cdot \lg n)$}edge from parent[draw=none]} + } + } + } + } + } +}; +\path (a) -- (j) node [midway] {+}; +\path (b) -- (g) node [midway] {+}; +\path (k) -- (l) node [midway] {+}; +\path (k) -- (g) node [midway] {+}; +\path (d) -- (e) node [midway] {+}; +\path (o) -- (p) node [midway] {+}; +\path (o) -- (e) node (x) [midway] {$\cdots$} + child [grow=down] { + node (y) {$O\left(\displaystyle\sum_{i = 0}^k 2^i \cdot \frac{n}{2^i}\right)$} + edge from parent[draw=none] + }; +\path (q) -- (r) node [midway] {+}; +\path (s) -- (r) node [midway] {+}; +\path (s) -- (t) node [midway] {+}; +\path (s) -- (l) node [midway] {=}; +\path (t) -- (u) node [midway] {+}; +\path (z) -- (u) node [midway] {=}; +\path (j) -- (t) node [midway] {=}; +\path (y) -- (x) node [midway] {$\Downarrow$}; +\path (v) -- (y) + node (w) [midway] {$O\left(\displaystyle\sum_{i = 0}^k n\right) = O(k \cdot n)$}; +\path (q) -- (v) node [midway] {=}; +\path (e) -- (x) node [midway] {+}; +\path (o) -- (x) node [midway] 
{+}; +\path (y) -- (w) node [midway] {$=$}; +\path (v) -- (w) node [midway] {$\Leftrightarrow$}; +\path (r) -- (c) node [midway] {$\cdots$}; +\end{tikzpicture}} +\caption{一个简单的tikz/pgf图例} +\label{tikz graph} +\end{figure} +\end{center} + + +%%%%%%%%%%% section %%%%%%%%%% +\section{如何张贴源码?} +使用 listings 宏包,可以将R、Maxima等语言的源码以某种固定的模式张贴出来。譬如, + +\begin{lstlisting} +## 生日问题:n <= 365 个人中至少两人生日相同的概率? +## 输出: n 个人当中至少两人生日相同的概率 P(A) +## 注意:R 语言中,变量有大小写的区分 +N <- 365 # 一年的天数 +n <- 50 # 选取的人数。 +InitProb <- matrix(1,n,1) # 一个 n 维的列向量的初始化 + +## 计算 n 个人当中没有人生日相同的概率 +for (i in 2:n){ + InitProb[i] <- InitProb[i-1] * (N-i+1)/N +} +Prob <- 1 - InitProb # 生日问题的解,输出一个 n 维列向量 +idx <- n - sum(Prob>0.5) + 1 # 概率大于 50% 所需最少人数 +\end{lstlisting} + +%%%%%%%%%%% section %%%%%%%%%% +\section{后记} +这个\TeX 模板只是为了提供一个学习\TeX 的参考,各节的内容并没有关联性。欢迎读者使用并改进该模板, +并祝学习\TeX 愉快! + +Knuth大师最初设计\TeX 的时候并没有想到中文化,\TeX 排版系统的中文化始终令初学者望而却步、云山雾罩。 +类UNIX系统下的teTeX和Windows系统下的MikTeX,都是\TeX 知名的发行版。然而,teTeX已经停止研发五年之久, +基于MikTeX的中文发行版CTeX 虽然如火如荼,但依然挡不住\TeX Live 一统江湖的大趋势。 + +虽然\TeX Live 还未入住FreeBSD的ports tree,但teTeX的远去, +令FreeBSD之下的很多ports不得不面临改换门庭的窘境。例如,auctex、latex-cjk等等。 + + +\TeX 的中文化可以有多种途径,xelatex 是其中最简单的(不见得是最美观的)。 +在\TeX Live 2011 之下,不需要有任何更多的设置,甚至不用考虑中英文混排, +xelatex能满足绝大多数中文化要求。这对于初学者来说,无疑是一个福音。 + + +%%%%%%%%%% 参考文献 %%%%%%%%%% +\begin{thebibliography}{} +\bibitem[Gelman et al., 2004]{Gelman} Gelman, A., Carlin, J. B., Stern, H. S. \& Rubin, D. B. + (2004) Bayesian Data Analysis (Second Edition). \newblock Chapman \& Hall/CRC. 
+\end{thebibliography}
+\clearpage
+\end{document}
+%%%%%%%%%% 结束 %%%%%%%%%%
+```
+
+Compiling this with xelatex produces a Chinese PDF file.
diff --git a/userguide/zh/software/list.md b/userguide/zh/software/list.md
new file mode 100644
index 0000000..543ad67
--- /dev/null
+++ b/userguide/zh/software/list.md
@@ -0,0 +1,38 @@
+## Software list ##
+
+The public base image of Docklet provides the following software:
+
+**System**
+* vim, emacs, nano
+* tmux, w3m, curl
+* zsh
+* unrar, p7zip-full, p7zip-rar
+* openssh-server
+* openbox
+* vpnc ( web )
+* fonts-droid, fonts-wqy-zenhei, fonts-wqy-microhei, fonts-arphic-gbsn00lp, fonts-arphic-gkai00mp, fonts-arphic-ukai, fonts-arphic-uming, latex-cjk-all
+
+**Language**
+* gcc, g++, gdb, make, autoconf, gfortran, scons
+* git, subversion
+* openjdk-7-jdk
+* python3, pip, virtualenv, pep8, pyflakes
+* scala
+* haskell-platform
+* ruby-full
+* clisp, guile
+* golang
+* nodejs, npm, node-legacy
+* r-base, r-recommended, r-cran-\*
+* texlive-full
+* pandoc, xsltproc, markdown
+
+**Tools**
+* scipy, numpy, matplotlib, pandas, sympy, plotly, statsmodels, scikit-learn, nltk (with data), tensorflow
+* jupyter notebook
+* spark
+* mpich2
+* storm
+* elasticsearch
+* kafka/zmq
+* couchdb/mongodb/memcached/redis-server
diff --git a/userguide/zh/software/mpi.md b/userguide/zh/software/mpi.md
new file mode 100644
index 0000000..d1a4d34
--- /dev/null
+++ b/userguide/zh/software/mpi.md
@@ -0,0 +1,57 @@
+## MPI example ##
+
+Docklet provides the [MPICH2](http://www.mpich.org) package.
+
+Assume the vcluster currently has two nodes, host-0 and host-1.
+
+Open the WEB Terminal.
+
+```
+$ cat /etc/hosts
+127.0.0.1 localhost
+172.16.0.46 host-0 host-0.tee
+172.16.0.45 host-1 host-1.tee
+```
+
+Get the list of nodes:
+
+```
+$ cat /etc/hosts | grep -v localhost | awk '{print $2}'
+host-0
+host-1
+```
+
+The WEB Terminal opens on host-0. On host-0, edit mpihello.c:
+
+```c
+#include "mpi.h"
+#include <stdio.h>
+#include <stdlib.h>
+#define MASTER 0
+
+int main (int argc, char *argv[])
+{
+    int numtasks, taskid, len;
+    char hostname[MPI_MAX_PROCESSOR_NAME];
+
+    MPI_Init(&argc, &argv);
+    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
+    MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
+    MPI_Get_processor_name(hostname, &len);
+    printf ("Hello from task %d on %s!\n", taskid, hostname);
+    if (taskid == MASTER)
+        printf("MASTER: Number of MPI tasks is: %d\n",numtasks);
+    MPI_Finalize();
+    return 0 ;
+}
+```
+
+Run it in the WEB Terminal:
+
+```
+$ mpicc mpihello.c -o mpihello
+$ mpirun -n 2 -hosts host-0,host-1 mpihello
+Hello from task 0 on host-0!
+MASTER: Number of MPI tasks is: 2
+Hello from task 1 on host-1!
+```
diff --git a/userguide/zh/software/spark.md b/userguide/zh/software/spark.md
new file mode 100644
index 0000000..61ae3a3
--- /dev/null
+++ b/userguide/zh/software/spark.md
@@ -0,0 +1 @@
+## Spark example ##
diff --git a/userguide/zh/workspace/README.md b/userguide/zh/workspace/README.md
new file mode 100644
index 0000000..e1b4e43
--- /dev/null
+++ b/userguide/zh/workspace/README.md
@@ -0,0 +1,16 @@
+# Workspace #
+
+The Docklet workspace is implemented by an integrated [Jupyter
+Notebook](https://github.com/jupyter/notebook).
+Plenty of documentation already explains how to use Jupyter Notebook;
+we recommend the latest official documentation, [The Jupyter notebook](http://jupyter-notebook.readthedocs.org/en/latest/index.html). Its [UI Components](http://jupyter-notebook.readthedocs.org/en/latest/ui_components.html) page introduces the Jupyter Notebook interface, whose main panel (the dashboard) is the entry point to every function, as shown below:
+
+
+
+This section focuses on the following functions of Jupyter Notebook as a workspace:
+
+* [Data management](data.md)
+* [File editing](edit.md)
+* [Terminal operations](terminal.md)
+* [Interactive programming](notebook.md)
diff --git a/userguide/zh/workspace/data.md b/userguide/zh/workspace/data.md
new file mode 100644
index 0000000..79f6d53
--- /dev/null
+++ b/userguide/zh/workspace/data.md
@@ -0,0 +1,14 @@
+## Data management ##
+
+After entering a workspace you see a tree view of files and directories.
+Select a file to **Rename** or **Delete** it.
+
+Click **New** - **Folder** to create a directory; click **New** - **Text File**
+to create a text file, *untitled.txt*, and open the [file editing](edit.md) view
+to edit it.
+
+Click the **Upload** button to upload data files.
+
+**Note**: the `~/nfs` directory is the data area. Data stored there is shared among the container nodes
+and is persistent. Files in all other directories are temporary and are destroyed together with the container node.
+
diff --git a/userguide/zh/workspace/edit.md b/userguide/zh/workspace/edit.md
new file mode 100644
index
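0000000..ba390d9
--- /dev/null
+++ b/userguide/zh/workspace/edit.md
@@ -0,0 +1,14 @@
+## File editing ##
+
+Clicking a text file (for example a .c, .py, .md, .txt, or .log file) in the Jupyter Notebook dashboard
+opens the text editor.
+
+You can rename the file, for example from untitled.txt to hello.c.
+
+Users familiar with vi or emacs can choose the **Vim** or **emacs** editing mode
+from the **Edit** menu; the default is the **Sublime Text** editing mode.
+
+The text editor recognizes dozens of languages, highlights keywords, and indents code automatically.
+
+**Note**: the text editor currently handles only small files. Opening a large data text file for editing may
+cause the **system to hang**.
diff --git a/userguide/zh/workspace/notebook.md b/userguide/zh/workspace/notebook.md
new file mode 100644
index 0000000..4da1cbd
--- /dev/null
+++ b/userguide/zh/workspace/notebook.md
@@ -0,0 +1,10 @@
+## Interactive programming ##
+
+Online **interactive programming** in the browser is one of the most attractive features of Jupyter Notebook.
+A notebook executes code in the browser and displays the results
+using rich media such as HTML, LaTeX, PNG, and SVG,
+which is particularly useful for visualizing data analysis.
+See [The Jupyter Notebook](http://jupyter-notebook.readthedocs.org/en/latest/notebook.html) for more details.
+
+The [development examples](../notebook/README.md) section provides information on programming with notebooks,
+mainly on choosing and using Python and R packages.
diff --git a/userguide/zh/workspace/terminal.md b/userguide/zh/workspace/terminal.md
new file mode 100644
index 0000000..f4c30a0
--- /dev/null
+++ b/userguide/zh/workspace/terminal.md
@@ -0,0 +1,45 @@
+## Terminal operations ##
+
+Terminal access is essential for many developers:
+almost every system administration, development, debugging, and analysis task can be carried out in a terminal.
+
+In the Jupyter Notebook dashboard,
+click **New** - **Terminal** to create a new terminal and work in it,
+for example to edit files with vi or to run programs.
+
+The Jupyter Notebook WEB Terminal has an important property: it keeps running in the background.
+Even after you close the terminal page in the browser, the terminal continues to run.
+Under **Running** - **Terminals** in the Jupyter Notebook dashboard
+you can find the running terminals and click one to resume your session.
+This is useful for long-running jobs.
+
+If a terminal is left idle for a long time it may stop responding;
+refreshing the page usually reactivates it.
+
+To run several tasks, open several terminals,
+or use [tmux](https://tmux.github.io) inside one terminal.
+
+**Tip**:
+For Chrome users, the terminal font may not be monospaced, which makes the terminal width
+change frequently. Setting the Chrome font fixes this: in Chrome ->
+Settings -> Show advanced settings -> Customize fonts, change the **Fixed-width font** to a monospaced font
+such as Droid Sans Mono.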
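+### Installing packages ###
+
+You can install packages in the WEB Terminal. The base system is Ubuntu, so use the `apt-get`
+command to install them.
+
+```
+$ apt-get install clang
+```
+
+If you intend to save the workspace as an image later, clearing the `apt-get` cache is recommended to reduce disk
+usage:
+
+```
+$ apt-get clean
+```
+
+For more about `apt-get`, see the
+[help documentation](https://help.ubuntu.com/community/AptGet/Howto).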
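The install and cache-cleanup steps above can be combined into one helper that skips the install when the tool is already present. This is only a minimal sketch: the function names `have_cmd` and `install_if_missing` are hypothetical and not part of Docklet, and it assumes a Debian/Ubuntu container in which the user is allowed to run `apt-get`.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Docklet: skip the install when the
# command is already on PATH, and clear the apt cache after a new install.

# Succeeds when the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Usage: install_if_missing COMMAND PACKAGE
# Assumes a Debian/Ubuntu container where the user may run apt-get.
install_if_missing() {
    if have_cmd "$1"; then
        echo "$1 already installed"
    else
        apt-get update && apt-get install -y "$2" && apt-get clean
    fi
}
```

For example, `install_if_missing clang clang` would install the package only when no `clang` command is found, and would clear the cache right after a successful install so that a later workspace image stays small.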