===== Service description =====
==== Servers ====
  * ''
    * PowerEdge R730
    * 128 GB RAM (4 DDR4 DIMM 2400MHz)
    * 2 x Nvidia GP102GL 24GB [Tesla P40]
    * AlmaLinux 9.1
    * CUDA 12.0 (see the quick check after this list)
    * **Mandatory use of the Slurm queue manager**.
  * HPC cluster servers: [[ en:
  * CESGA servers: [[ en:
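Once logged in, a quick way to confirm which GPU driver and CUDA toolkit are active in your session is sketched below (the versions reported depend on the server and on any modules you have loaded):

<code bash>
# List the GPUs visible to the driver and the highest CUDA version it supports
nvidia-smi

# Show the version of the CUDA compiler available in the current environment
nvcc --version
</code>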
==== Restricted access GPU servers ====
  * ''
    * PowerEdge R730
    * 2 x [[https://
    * 128 GB RAM (4 DDR4 DIMM 2400MHz)
    * 2 x Nvidia GP102GL 24GB [Tesla P40]
    * Ubuntu
    * **Slurm as a mandatory use queue manager**.
    * **Modules for library version management** (see the example after this list).
    * CUDA 11.0
    * OpenCV 2.4 and 3.4
    * Atlas 3.10.3
    * 192 GB RAM (12 DDR4 DIMM 2933MHz)
    * Nvidia Quadro P6000 24GB (2018)
    * Nvidia Quadro RTX8000 48GB (2019)
    * Operating system CentOS 7.7
    * Nvidia Driver 418.87.00 for CUDA 10.1
    * Docker 19.03
    * [[https://
  * ''
    * Dell PowerEdge
    * 2 x [[ https://
    * 128 GB RAM
    * 2 x NVIDIA Ampere A100 80 GB
    * AlmaLinux
  * ''
    * PowerEdge
    * 2 x [[ https://
    * 128 GB RAM
    * NVIDIA Ampere A100 80 GB
    * AlmaLinux operating system
    * 256 GB RAM
  * ''
    * Dell PowerEdge R760
    * 2 x [[ https://ark.intel.com/content/www/
    * 384 GB RAM
    * 2 x NVIDIA Hopper H100 80 GB
    * AlmaLinux 9.2 operating system
    * NVIDIA driver 555.42.06 and CUDA 12.5
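On the servers that provide environment modules, library versions such as CUDA are selected per session instead of being fixed system-wide. A minimal sketch of the usual workflow follows; the module name ''cuda/11.0'' is only an assumption for illustration, the actual names and versions available on each server should be checked with ''module avail'':

<code bash>
# List every module (and version) installed on this server
module avail

# Load a specific CUDA version for the current shell session (name assumed for illustration)
module load cuda/11.0

# Show which modules are currently loaded
module list
</code>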
===== Activation =====
Not all servers are available to all users.
===== User Manual =====
Use SSH. Hostnames and IP addresses are:
  * ctgpgpu4.inv.usc.es - 172.16.242.201
  * ctgpgpu5.inv.usc.es - 172.16.242.202
  * ctgpgpu6.inv.usc.es - 172.16.242.205
  * ctgpgpu9.inv.usc.es - 172.16.242.94
  * ctgpgpu10.inv.usc.es - 172.16.242.95
  * ctgpgpu11.inv.usc.es - 172.16.242.96
  * ctgpgpu12.inv.usc.es - 172.16.242.97
Connection is only possible from inside the CITIUS network. To connect from other locations or from the RAI network it is necessary to use the [[https://
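For example, from a machine inside the CITIUS network you can open a session on any of the hosts listed above; replace ''username'' with your own account (the user name here is a placeholder):

<code bash>
# Connect by hostname...
ssh username@ctgpgpu4.inv.usc.es

# ...or, equivalently, by IP address
ssh username@172.16.242.201
</code>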
==== Job management with SLURM ====
On servers where queue management software is installed, its use is mandatory to run jobs.
To send a job to the queue, the command ''
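Slurm's standard batch-submission tool is ''sbatch'', which takes a small shell script containing job directives. A minimal sketch is shown below; the job name, GPU request, time limit and program are placeholders, not values prescribed by this wiki:

<code bash>
#!/bin/bash
#SBATCH --job-name=gpu-test    # name shown in the queue (placeholder)
#SBATCH --gres=gpu:1           # request one GPU on the node
#SBATCH --time=01:00:00        # wall-clock time limit

# Run your GPU program; ./my_cuda_app is a placeholder for your own binary
./my_cuda_app
</code>

Save the script as ''job.sh'', submit it with ''sbatch job.sh'' and check its state with ''squeue''.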