IPyStata enables the use of Stata together with Python via Jupyter (IPython) notebooks.
Author: Ties de Kok (Personal Page)
PyPi: https://pypi.python.org/pypi/ipystata
Documentation: Example notebook
pip install ipystata
Alternatively, get the 4.X version from GitHub:
pip install git+https://github.com/TiesdeKok/ipystata
pip install ipystata --upgrade --force-reinstall
Python 2.7 or 3.x
IPython 3 or 4+ (http://ipython.org/)
Pandas 0.17.x+ (http://pandas.pydata.org/) (I recommend using a distribution like Anaconda)
Recent version of Stata (13+ preferably) (http://www.stata.com/)
IPyStata can communicate with Stata using two different techniques:
The Stata Automation mode has a richer feature set than the Stata Batch mode.
Unfortunately, Stata only supports Stata Automation for Stata instances running on a Windows OS.
On Windows, Stata Automation is the default, but it is possible to use Stata Batch mode instead.
On Unix operating systems (OS X and Linux) it is only possible to use IPyStata in Stata Batch mode.
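The selection rule described above can be summarized in a few lines of Python. This is only an illustrative sketch of the documented behaviour, not IPyStata's own code; the function name `pick_mode` and the mode labels are hypothetical.

```python
import platform


def pick_mode(system=None, force_batch=False):
    """Return 'automation' or 'batch' following the rule described above.

    `pick_mode` is a hypothetical helper for illustration only; it is not
    part of IPyStata.
    """
    # platform.system() returns 'Windows', 'Darwin' (OS X) or 'Linux'.
    system = system or platform.system()
    # Stata Automation exists only on Windows and can be overridden there.
    if system == "Windows" and not force_batch:
        return "automation"
    # Every other case falls back to Stata Batch mode.
    return "batch"


print(pick_mode("Windows"))
print(pick_mode("Windows", force_batch=True))
print(pick_mode("Linux"))
```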
Go to your Stata installation directory and either:
cd C:\Program Files (x86)\Stata14
(Change this to your own Stata directory.)
Look up the name of your Stata executable (e.g. StataMP-64.exe) and in your command window type:
StataMP-64.exe /Register
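If you prefer to script the registration step, a minimal sketch could look like the one below. It assumes the example directory and executable name shown above; the helper `register_stata` is hypothetical and only wraps the same `/Register` call. By default it just returns the command without running it.

```python
import subprocess
from pathlib import PureWindowsPath


def register_stata(stata_dir, exe_name, dry_run=True):
    """Build (and optionally run) the Stata registration command.

    Hypothetical helper for illustration; adjust the directory and
    executable name to your own installation.
    """
    command = [str(PureWindowsPath(stata_dir) / exe_name), "/Register"]
    if not dry_run:
        # Only meaningful on Windows, ideally from an administrator prompt.
        subprocess.run(command, check=True)
    return command


print(register_stata(r"C:\Program Files (x86)\Stata14", "StataMP-64.exe"))
```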
I get a com error when using IPyStata in Stata Automation mode?
IPyStata cannot communicate with Stata. This error indicates that the registration of Stata was unsuccessful.
A potential solution: try to register again but make sure to run the CMD window as administrator.
Do I have to register Stata every time I want to use IPyStata?
No, you only have to register your Stata instance once unless you want to change your Stata installation.
For more detailed instructions see this page.
The Batch mode approach works on Windows, Mac OS, and Linux
The first step is to tell IPyStata where it can find the Stata installation, use the following commands:
In[1]: import ipystata
In[2]: from ipystata.config import config_stata
In[3]: config_stata('Path to your Stata executable')
Note: you need to restart the Jupyter Notebook kernel after setting a new Stata installation!
You can find the Stata executable in the installation directory of Stata, for example:
Windows --> 'C:\Program Files (x86)\Stata14\StataSE-64.exe'
Mac OS X --> '/Applications/Stata/StataSE.app/Contents/MacOS/stataSE'
Linux --> '/home/user/stata14/stata-se'
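To avoid hard-coding one path in a notebook that is shared across machines, the example paths above can be selected by operating system. The sketch below is an assumption-laden illustration: the helper `example_stata_path` is hypothetical and the paths are the examples from this README, so replace them with your own.

```python
import platform

# Example locations copied from the list above; adjust to your installation.
EXAMPLE_PATHS = {
    "Windows": r"C:\Program Files (x86)\Stata14\StataSE-64.exe",
    "Darwin": "/Applications/Stata/StataSE.app/Contents/MacOS/stataSE",  # Mac OS X
    "Linux": "/home/user/stata14/stata-se",
}


def example_stata_path(system=None):
    """Return the example Stata executable path for the given OS name."""
    system = system or platform.system()
    if system not in EXAMPLE_PATHS:
        raise ValueError("No example path known for {!r}".format(system))
    return EXAMPLE_PATHS[system]


# In a notebook the result would then be passed to config_stata:
#   from ipystata.config import config_stata
#   config_stata(example_stata_path())
print(example_stata_path("Linux"))
```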
It is possible to use config_stata to configure IPyStata to use the Stata Batch Mode on Windows instead of the default Stata Automation mode. See the example below:
In[1]: import ipystata
In[2]: from ipystata.config import config_stata
In[3]: config_stata('Path to your Stata executable', force_batch=True)
Note: This is only advisable if you have a portable Stata installation that you cannot register or if you want to use IPyStata on a Windows server.
The Stata Automation method is in most other cases a better option.
I set the installation directory but IPyStata still does not work?
The new Stata installation is only initialized after a complete kernel restart.
A potential solution: in the Jupyter Notebook click kernel --> restart
Do I have to configure my Stata installation every time I want to use IPyStata?
No, you only have to configure your Stata executable once unless you want to change your Stata installation.
If you use Stata Automation --> Make sure that you have a registered Stata instance: Windows
If you use Stata Batch Mode --> Make sure that you have configured your Stata installation: Unix (Linux, Mac OS)
You can use IPyStata through the %%stata cell magic.
See the basic instructions below or the example notebook.
Example notebook for Mac OS X and Linux users: batch mode notebook
Note: most intermediate files are stored in the .ipython/stata directory.
If you use Stata Automation: several options are included to manage your sessions, see the session manager section.
IPyStata is imported and loaded using import ipystata.
A cell with Stata code is defined by the cell magic %%stata.
For example:
In[1]: import ipystata
In[2]: %%stata
display "Hello, I am printed in Stata."
Send a Pandas dataframe to be used in the Stata session (Both methods):
-d --data
In[1]: %%stata -d dataframe
Define the DTA version internally, by default set to 114 (Both methods):
-vr --version
In[1]: %%stata -d dataframe -vr 118
Return the dataset from Stata after code execution and load it into a Pandas dataframe (Both methods):
-o --output
In[1]: %%stata -o dataframe
Input Python lists and load them into Stata as macros (Both methods):
-i --input
In[1]: example_list = ['var_1', 'var_2']
In[2]: %%stata -i example_list
display "`example_list'"
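Conceptually, the list ends up in Stata as a local macro whose content is the space-separated items. The sketch below only illustrates that mapping; it is not how IPyStata implements `-i` internally, and the helper `list_to_local_macro` is hypothetical.

```python
def list_to_local_macro(name, items):
    """Build a Stata `local` definition holding the items of a Python list.

    Hypothetical helper for illustration; IPyStata handles this for you
    when the -i argument is used.
    """
    return "local {} {}".format(name, " ".join(str(item) for item in items))


example_list = ['var_1', 'var_2']
print(list_to_local_macro('example_list', example_list))
# prints: local example_list var_1 var_2
```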
Graphs are displayed automatically and multiple graphs are possible (Only for Stata Automation!):
If you want to show multiple graphs, make sure to use the , name(.., replace) argument in your Stata code.
The order is not guaranteed to be the same as the generation order, so it is recommended to use the title() argument when showing multiple graphs.
It is possible to prevent graphs from showing using the -nogr or --nograph arguments.
If you want a Stata graph as an output of an IPyStata cell you can use the following argument (Only for Stata Batch Mode!):
-gr --graph
In[1]: %%stata -gr
Note: Graph export is only partially supported by Stata if the OS has no GUI. To work around this problem the figures are shown in PDF-format if you use Stata Batch Mode. See: Statalist
Prevents any output from being shown below the cell (Both methods):
-np --noprint
In[1]: %%stata -np
For inspection purposes it is possible to open the Stata window instead of running it quietly (Both methods):
-os --openstata
In[1]: %%stata -os
Note: this only works on Windows and Mac OS X.
Define a session to execute the code with (Only for Stata Automation!):
-s --session
In[1]: %%stata -s session_name
(Note: if no session argument is provided the main session is used.)
Set the working directory of the Stata session to your Python working directory (Only for Stata Automation!):
-cwd --changewd
In[1]: %%stata -cwd
Set code in the cell to run in Mata (Only for Stata Automation!):
-m --mata
In[1]: %%stata -m
Retrieve user-defined macros from Stata into a Python dictionary, macro_dict (Only for Stata Automation!):
-gm --getmacro
In[1]: %%stata -gm macro_1 -gm macro_2
local macro_1 item1 item2
local macro_2 item3 item4
In[2]: macro_dict['macro_1']
In[3]: macro_dict['macro_2']
Set a working directory to use while executing this cell (Only for Stata Batch Mode!):
-cwd --changewd
In[1]: %%stata -cwd '~/folder'
IPyStata 0.2 introduces the possibility to use many different Stata sessions that by default run in the background. In order to avoid using unnecessary system resources several tools and automatic cleanup routines are included.
Display all active Stata sessions:
In[1]: %%stata
sessions
Reveal all Stata sessions:
In[1]: %%stata
reveal all
Hide all Stata sessions:
In[1]: %%stata
hide all
Close all Stata sessions initiated by IPyStata:
In[1]: %%stata
close
Close all Stata sessions (Warning! This closes all Stata windows):
In[1]: %%stata
close all
Minor improvements for line width in the log files and added support for UTF-8 encoding in Stata files (requires Pandas 1.0+ to work!).
The Stata Automation method introduced in IPyStata 0.2 only works on Windows; this release adds support for the Mac OS X and Linux operating systems using the Stata Batch Mode approach.
The execution method is determined by IPyStata; non-Windows users will automatically use the Stata Batch Mode technique.
For Windows users the default method is Stata Automation, but it is possible to configure IPyStata to use Stata Batch Mode instead.
After a discussion with James Fiedler I decided to overhaul my initial code to have it interact with Stata using Automation instead of the batch mode. This approach is inspired by James's Stata-Kernel; check out the awesome early development version here: https://github.com/jrfiedler/stata-kernel.
Pros:
- Extra functionality:
- Persistent Stata sessions. (Just as if you were using Stata directly!)
- Multiple Stata sessions in one notebook.
- Allows IPyStata to retrieve macros directly from Stata into Python.
- This approach is more idiomatic as it allows for direct interaction with Stata.
- Keeps my Stata magic functionality consistent with the Stata kernel by James Fiedler.
Cons:
- Windows only (Stata Automation is Windows only).
- Requires the user to register their Stata client.
- Requires recent Stata version (13 / 14).
Bug fixes and other improvements:
- Improved the output display functionality:
- Loops should now be displayed correctly.
- Fixed inconsistent white space at the beginning / end of output.
- Internal file-handling changed to using absolute paths, working directory functionality is now explicitly included in the -cwd argument.
- Package is compatible with both Python 2.7.x and Python 3.x.
- Plots are now supported using the -gr or --graph arguments (added in 0.2.1)
- Both IPython 3 and IPython 4 are now supported (added in 0.2.2)
- Fixed error when replacing dataset in Stata + single item to macro now possible (added in 0.2.3)
Todo:
- Add an option for non-Windows users that uses the batch mode functionality.
- Explore the possibilities of asynchronous Stata code execution using different sessions.
- Improve Stata syntax highlighting.
Experimental support for Stata syntax highlighting is included. CodeMirror does not have a Stata mode, hence the R mode is modified to accommodate Stata code. Setup instructions are below:
Find your notebook package installation folder. If you are using IPython 3 go to the folder IPython, for IPython 4 go to the folder notebook. For example:
C:\Users\*User*\AppData\Local\Enthought\Canopy\User\Lib\site-packages\IPython
C:\Users\*User*\Anaconda\Lib\site-packages\IPython
C:\Users\*User*\AppData\Local\Enthought\Canopy\User\Lib\site-packages\notebook
C:\Users\*User*\Anaconda\Lib\site-packages\notebook
In the IPython folder (IPython 3 users) go to the following directory:
\IPython\html\static\components\codemirror\mode
In the notebook folder (IPython 4 users) go to the following directory:
\notebook\static\components\codemirror\mode\
Create a new folder in the "mode" folder called 'stata'
\IPython\html\static\components\codemirror\mode\stata
or
\notebook\static\components\codemirror\mode\stata
Copy stata.js from the ipystata folder (see Github) into the newly created 'stata' folder.
You can then enable syntax highlighting by running the following code in a Jupyter Notebook cell and restarting the kernel:
import ipystata
from ipystata.config import config_syntax_higlight
config_syntax_higlight(True)
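The folder creation and copy steps above can also be scripted. The following is a minimal sketch, assuming you have already located the CodeMirror "mode" folder and downloaded stata.js; the helper `install_stata_mode` is hypothetical and not part of IPyStata.

```python
import shutil
from pathlib import Path


def install_stata_mode(mode_dir, stata_js):
    """Create <mode_dir>/stata and copy stata.js into it.

    mode_dir: path to the CodeMirror "mode" folder described above.
    stata_js: path to the stata.js file taken from the ipystata folder.
    Returns the path of the copied file.
    """
    target_dir = Path(mode_dir) / "stata"
    # Create the 'stata' folder if it does not exist yet.
    target_dir.mkdir(exist_ok=True)
    # shutil.copy returns the path of the newly created file.
    return Path(shutil.copy(str(stata_js), str(target_dir)))
```

A call would look like `install_stata_mode(r"...\notebook\static\components\codemirror\mode", "stata.js")`, with the first argument replaced by the full path on your machine.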
If you have questions or experience problems please use the issues tab of this repository.
You can also e-mail me at t.c.j.dekok [at] tilburguniversity.edu .
MIT - Ties de Kok - 2017
This project is inspired by and based on the excellent work of:
Contributors:
@Pacbard
@bquistorff
This project is not affiliated with or endorsed by Statacorp.
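One is Stata, an essential tool for empirical research in economics and management. The other is Python, a glue language that is always within reach. To scrape data, bring out Python; to clean the data and produce a sky full of significance stars, draw Stata. With the two combined, you can make your way through the data era. The earlier post on the R language and applied econometrics already mentioned that Stata has some shortcomings that need to be addressed. In fact, at the Stata Conference Chicago 2016, people raised not only the issue of interactive plotting with the Web, but also Reproducible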