Python, Anaconda and JupyterLab

Within the DRE, it is currently not possible to use pip to install packages. We recommend using Conda instead. First follow the steps below, then use, for example, Anaconda Prompt or JupyterLab to create and run your code. To install packages with Conda, use the following command:
conda install <package>
or
conda install -c conda-forge <package>

Basic install

Open Ports Anaconda

In mydre.org, on the External Access tab, add the following rules (last update: 2022-09-22):

Rule & Description    IP-address      Port
conda.anaconda.org    104.17.92.24    443
conda.anaconda.org    104.17.93.24    443
repo.anaconda.com     104.16.130.3    443
repo.anaconda.com     104.16.131.3    443
docs.conda.io         188.114.96.0    443
docs.conda.io         188.114.97.0    443

The IP addresses might change. This is easy to check from a local computer: open cmd and run nslookup <url>.
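The same check can be sketched in Python on a local computer. This is a minimal sketch, not part of the official procedure: the function name resolve_hosts and its resolver parameter are hypothetical, and the hostnames are the ones from the rules listed above. It only reports the addresses that DNS currently returns; compare them by hand with the External Access rules.

```python
import socket

# Hostnames taken from the External Access rules listed above.
HOSTS = ["conda.anaconda.org", "repo.anaconda.com", "docs.conda.io"]


def resolve_hosts(hosts, resolver=socket.gethostbyname_ex):
    """Return {hostname: sorted IPv4 addresses}; an empty list means the lookup failed."""
    results = {}
    for host in hosts:
        try:
            # gethostbyname_ex returns (canonical name, aliases, IPv4 address list)
            _name, _aliases, addresses = resolver(host)
            results[host] = sorted(addresses)
        except OSError:
            # socket.gaierror (DNS failure) is a subclass of OSError
            results[host] = []
    return results


if __name__ == "__main__":
    for host, addresses in resolve_hosts(HOSTS).items():
        print(host, addresses or "could not be resolved")
```

The resolver parameter exists only so the function can be tried out without network access; in normal use it can be left at its default.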
When using the External Access feature to enable internet access, make sure that the 'Use a proxy server' setting on your virtual machine is turned off. To do this, go to Start >> Settings (icon) >> Network & Internet >> Proxy, and turn off the setting 'Use a proxy server'.

Install Anaconda or Miniconda

You can use Anaconda or its minimal counterpart Miniconda to create and run Python scripts. To install either one on your virtual machine, go to https://www.anaconda.com (for Anaconda) or https://docs.conda.io/en/latest/miniconda.html (for Miniconda) and download the install file.
With the abovementioned ports opened, you can go to these websites directly from your virtual machine. However, for proper logging of what goes into your workspace, we recommend you download these files outside the DRE and upload them to your workspace through the DRE Portal.
Always make sure you install programs as administrator, to make them available for other users of the same virtual machine. To do this, go to the folder where the install file is located, right-click and choose Run as Administrator.

Install JupyterLab (optional)

JupyterLab is a web-based development environment that can be used for a multitude of programming languages, including Python. It offers Jupyter Notebooks, where you can create and run your code, and document your work at the same time.

Step 1:
Open Anaconda prompt as administrator and run the following line of code:
conda install -c conda-forge jupyterlab
Step 2:
Set the default location for JupyterLab to the Z: drive by following these two steps:
1. Create a folder Z:\Jupyter Labs (make sure you use this exact name).
2. Download the appropriate zip file for Anaconda or Miniconda at the bottom of this article and upload it to your Workspace. Then, within the virtual machine, extract the two files and:
      - Put the shortcut file in C:\Users\Public\Desktop (the Desktop folder may be hidden; just type the location into File Explorer).
      - If you installed Miniconda, put the icon file in %ALLUSERSPROFILE%\Miniconda3\Menu (again, just type the location into File Explorer).
This ensures that all code and output are automatically created (and thus backed up) on the Z: drive. It also allows any user to quickly start JupyterLab from the desktop.
This comes at the cost of some performance, because the Z: drive is not as fast as the C: drive. Alternatively, create a Jupyter Labs folder on the C: drive and change the settings of the shortcut obtained from the zip file (right-click the shortcut and choose Properties, then change Target and Start in from Z: to C:).
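To confirm that a notebook really runs from the intended folder, a check like the one below can be run in a notebook cell. It is only a sketch under assumptions: the helper name is_inside is hypothetical, and EXPECTED_ROOT assumes the Z:\Jupyter Labs folder from step 1 (change it if you chose the C: drive instead).

```python
import os
from pathlib import PureWindowsPath

# Assumption: the folder created in step 1; adjust if you use the C: drive.
EXPECTED_ROOT = r"Z:\Jupyter Labs"


def is_inside(path, root):
    """Return True if path is root itself or lies somewhere below root."""
    try:
        # relative_to raises ValueError when path is not located under root
        PureWindowsPath(path).relative_to(PureWindowsPath(root))
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    cwd = os.getcwd()
    if is_inside(cwd, EXPECTED_ROOT):
        print("Working directory is on the expected drive:", cwd)
    else:
        print("Working directory is outside", EXPECTED_ROOT, "->", cwd)
```

PureWindowsPath is used because it compares Windows paths without touching the file system, so the check itself cannot create or modify anything.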
Step 3 (recommended):
Set your default browser to Google Chrome or Microsoft Edge: go to Start > Settings (icon on the left) > Apps > Default apps and change the setting for Web browser.

Useful tips

Always open Anaconda prompt in admin mode.

Updating

conda update -c conda-forge jupyterlab
conda update -c conda-forge --all
conda update -c conda-forge python
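To see what an update actually changed, the installed versions can be printed before and after running the commands above. The following is a small sketch using only the Python standard library; the helper name installed_version is hypothetical.

```python
import sys
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # sys.version starts with the interpreter version, e.g. "3.x.y ..."
    print("python", sys.version.split()[0])
    print("jupyterlab", installed_version("jupyterlab") or "not installed")
```

Run it in the same Conda environment that you update, otherwise the reported versions belong to a different installation.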

Interesting packages*

  1. Tabulate: conda install -c conda-forge tabulate
  2. OpenPyXL: conda install -c conda-forge openpyxl
  3. numpy: conda install -c conda-forge numpy
  4. pandas: conda install -c conda-forge pandas
  5. matplotlib: conda install -c conda-forge matplotlib
  6. seaborn: conda install -c conda-forge seaborn
  7. Castor: conda install -c conda-forge castorapi
* You can install multiple packages in one go, for example: conda install -c conda-forge tabulate seaborn castorapi
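Before installing, it can help to see which of the packages listed above are still missing from the current environment. The sketch below is illustrative only: the helper name missing_packages is hypothetical, and the import names in PACKAGES are assumed to match the Conda package names, which is not guaranteed for every package.

```python
from importlib.util import find_spec

# Conda package name -> Python import name.
# Assumption: the import names equal the Conda names listed above.
PACKAGES = {
    "tabulate": "tabulate",
    "openpyxl": "openpyxl",
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "castorapi": "castorapi",
}


def missing_packages(packages):
    """Return the Conda names whose import name cannot be found in this environment."""
    return [
        conda_name
        for conda_name, module_name in packages.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        # Print a single command that installs everything still missing
        print("conda install -c conda-forge " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```

The script only prints the command; it does not run Conda itself, so the installation still happens in Anaconda Prompt as described above.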

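JupyterLab and multiprocessing

On Windows, worker processes must be able to import the function they run, so functions defined directly in a notebook cell cannot be used with multiprocessing.Pool. Write them to a module file first (Cell 1) and import them from there (Cell 2).

Cell 1
%%writefile magic_functions.py
def your_function(f):
    # Placeholder: replace with your own computation
    return f

def process_frame(f):
    # Return a (key, value) pair so the results can be collected in a dict
    return f, your_function(f)
Cell 2
from tqdm import tqdm
from multiprocessing import Pool
from magic_functions import process_frame

# Example values; adapt batch and y to your own data
batch = 10
y = 0
frames_list = [x + batch * y for x in range(0, batch)]

with Pool() as p:
    # imap preserves the input order; tqdm shows a progress bar
    pool_outputs = list(
        tqdm(
            p.imap(process_frame, frames_list),
            total=len(frames_list)))

print(pool_outputs)
new_dict = dict(pool_outputs)
print("dict: ", new_dict)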
