Python, Anaconda and JupyterLab

Warning
Domain allowlisting is now a self-service feature and is the preferred option. For proxy settings in Anaconda in combination with domain allowlisting, please follow this article: https://support.mydre.org/portal/en/kb/articles/proxy-configurations-anaconda-miniconda.

The article below is outdated and will no longer be updated.

First version: 2021-04-09
Last Update: 2024-11-11
Last Change: Statement about availability of Pip has been adapted to show possibilities.

Within the DRE, pip can only be used to install packages when domain allowlisting is in use. The domains that need to be allowlisted are listed under PyPI in the article Domains to be whitelisted/allowlisted for known applications.

When using IP-allowlisting, only Conda can be used to install packages. First follow the steps below, then use, for example, Anaconda Prompt or JupyterLab to create and run your code. To install packages with Conda, use the following command:
conda install <package>
or
conda install -c conda-forge <package>

Basic install

Open Ports Anaconda

In mydre.org, on the External Access tab, add the following rules (last update: 2022-09-22):

Rule & Description    IP-address      Port
conda.anaconda.org    104.17.92.24    443
conda.anaconda.org    104.17.93.24    443
repo.anaconda.com     104.16.130.3    443
repo.anaconda.com     104.16.131.3    443
docs.conda.io         188.114.96.0    443
docs.conda.io         188.114.97.0    443

Info
IP addresses might change. This is easy to check from a local computer: open cmd and run nslookup <url>.
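The same lookup can be scripted. The snippet below is a minimal sketch, not part of the original instructions, that uses Python's standard socket module to list the IPv4 addresses a host name currently resolves to. The names resolve_ipv4 and print_current_addresses are illustrative. Run it on a computer with unrestricted internet access and compare the output with the rules listed above.

```python
import socket

# Host names taken from the External Access rules above.
HOSTS = ("conda.anaconda.org", "repo.anaconda.com", "docs.conda.io")


def resolve_ipv4(hostname, port=443):
    """Return the sorted, de-duplicated IPv4 addresses for hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) tuple.
    return sorted({info[4][0] for info in infos})


def print_current_addresses(hosts=HOSTS):
    """Print one line per host with the addresses it resolves to right now."""
    for host in hosts:
        try:
            print(host, ", ".join(resolve_ipv4(host)))
        except socket.gaierror as error:
            print(host, "lookup failed:", error)
```

Calling print_current_addresses() prints one line per host. Addresses that differ from the rules above suggest the rules need updating.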
Warning
When using the External Access feature to enable internet access, make sure that the 'Use a proxy server' setting on your virtual machine is turned off. To do this, go to Start >> Settings (icon) >> Network & Internet >> Proxy, and turn off the setting 'Use a proxy server'.
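As a quick cross-check from Python, the sketch below reports which proxy settings the standard library would pick up. It relies on urllib.request.getproxies(), which reads the *_proxy environment variables and, on Windows, falls back to the system proxy settings when none are set. The helper name active_proxies is illustrative, and this check is an addition to, not a replacement for, the Settings screen described above.

```python
import urllib.request


def active_proxies(proxies=None):
    """Return the HTTP/HTTPS proxy settings as a {scheme: url} dict.

    When proxies is None, the settings are read with
    urllib.request.getproxies().
    """
    if proxies is None:
        proxies = urllib.request.getproxies()
    # Keep only the schemes that matter for downloading packages.
    return {k: v for k, v in proxies.items() if k in ("http", "https")}


found = active_proxies()
if found:
    print("Proxy settings detected:", found)
else:
    print("No HTTP/HTTPS proxy detected.")
```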

Install Anaconda or Miniconda

You can use Anaconda or its minimal counterpart Miniconda to create and run Python scripts. To install on your virtual machine, go to https://www.anaconda.com (for Anaconda) or https://docs.conda.io/en/latest/miniconda.html (for Miniconda) and download the install file.
Idea
With the abovementioned ports opened, you can go to these websites directly from your virtual machine. However, for proper logging of what goes into your workspace, we recommend you download these files outside the DRE and upload them to your workspace through the DRE Portal.
Alert
Always make sure you install programs as administrator, to make them available for other users of the same virtual machine. To do this, go to the folder where the install file is located, right-click and choose Run as Administrator.

Install JupyterLab (optional)

JupyterLab is a web-based development environment that can be used for a multitude of programming languages, including Python. It offers Jupyter Notebooks, where you can create and run your code, and document your work at the same time.

Step 1:
Open Anaconda prompt as administrator and run the following line of code:
conda install -c conda-forge jupyterlab
Step 2:
Set the default location for JupyterLab to the Z:-drive by following these two steps:
1. Create a folder Z:\Jupyter Labs (make sure you use this exact name)
2. Download the appropriate zip-file for Anaconda or Miniconda at the bottom of this article and upload it to your Workspace. Then, within the virtual machine, extract the two files and:
      - Put the shortcut file in: C:\Users\Public\Desktop (the Desktop folder may be hidden, just type in the location in your file explorer)
      - If you installed Miniconda, put the icon file in %ALLUSERSPROFILE%\Miniconda3\Menu (again, just type in the location in your file explorer)
Info
This makes sure that all code and output are automatically created (and thus backed up) on the Z:-drive. It also allows any user to quickly start JupyterLab from the desktop.
Alert
This comes at the cost of a bit of performance, because the Z-drive is not as fast as the C-drive. Alternatively, create a Jupyter Labs folder on the C:-drive and change the settings of the shortcut obtained from the zip-file (right-click the shortcut and choose Properties, then change Target and Start-in from Z: to C:).
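To confirm from inside a notebook which drive you are working on, a small check like the one below can help. It is a sketch under the assumption that the notebook's working directory is the folder JupyterLab was started in or a subfolder of it, which is the usual behaviour. The helper name is_on_drive is illustrative.

```python
from pathlib import Path, PureWindowsPath


def is_on_drive(path, drive="Z:"):
    """Return True if a Windows path is located on the given drive."""
    return PureWindowsPath(path).drive.upper() == drive.upper()


# Show where the notebook is currently working.
current = Path.cwd()
print("Working directory:", current)
print("On Z:-drive:", is_on_drive(str(current)))
```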
Step 3 (recommended):
Set your default browser to Google Chrome or Microsoft Edge: go to Start > Settings (icon on the left) > Apps > Default apps and change the setting for Web browser.

Useful tips

Always open Anaconda prompt in admin mode.

Updating

conda update -c conda-forge jupyterlab
conda update -c conda-forge --all
conda update -c conda-forge python

Interesting packages*

  1. Tabulate: conda install -c conda-forge tabulate
  2. OpenPyXL: conda install -c conda-forge openpyxl
  3. numpy: conda install -c conda-forge numpy
  4. pandas: conda install -c conda-forge pandas
  5. matplotlib: conda install -c conda-forge matplotlib
  6. seaborn: conda install -c conda-forge seaborn
  7. Castor: conda install -c conda-forge castorapi
* You can install multiple packages in one go, for example: conda install -c conda-forge tabulate seaborn castorapi
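After installing, you can check from Python which of these packages are available in the current environment. The snippet below is a minimal sketch using the standard importlib module. It assumes that each import name equals the conda package name, which holds for most of the packages above but is not guaranteed in general. The helper name missing_packages is illustrative.

```python
import importlib.util

# Import names for the packages listed above (assumed to match the
# conda package names).
PACKAGES = [
    "tabulate", "openpyxl", "numpy", "pandas",
    "matplotlib", "seaborn", "castorapi",
]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(PACKAGES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed packages are available.")
```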

JupyterLab and multiprocessing

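Cell 1
%%writefile magic_functions.py
# Worker functions are saved to a module so that the worker processes can import them.
def your_function(f):
    # Replace with your own processing logic.
    return f

def process_frame(f):
    # Return the input together with its result.
    return f, your_function(f)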
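Cell 2
from tqdm import tqdm
from multiprocessing import Pool
from magic_functions import process_frame

# Example values (assumptions): batch is the number of frames per batch, y the batch index.
batch = 10
y = 0
frames_list = [x + batch*y for x in range(0, batch)]

# Process the frames in parallel; tqdm shows a progress bar.
with Pool() as p:
    pool_outputs = list(
        tqdm(
            p.imap(process_frame, frames_list),
            total=len(frames_list)))

print(pool_outputs)
# Each output is an (input, result) pair, so the list converts directly to a dict.
new_dict = dict(pool_outputs)
print("dict: ", new_dict)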
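On Windows, worker processes are started as fresh Python processes that import the worker function by name, which is presumably why Cell 1 writes the functions to a separate module. When the same pattern is run as a standalone script instead of a notebook, the pool has to be created under an if __name__ == "__main__" guard. The sketch below illustrates this with a hypothetical worker that squares its input; it leaves out tqdm to stay dependency-free.

```python
from multiprocessing import Pool


def process_frame(f):
    """Hypothetical worker: return the input together with its square."""
    return f, f * f


def run(frames, processes=2):
    """Process all frames in a pool of worker processes and return a dict."""
    with Pool(processes=processes) as p:
        # imap yields results in input order as they become available.
        return dict(p.imap(process_frame, frames))


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block when they
    # import the script, which is required on Windows.
    print(run(range(5)))
```

Save the code as a .py file and run it from Anaconda Prompt with python followed by the file name.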
