Idgraf
From Digitalis
Current revision as of 09:22, 29 August 2016
Overview
Idgraf is a dual-socket Intel Xeon Westmere machine featuring 8 Nvidia GPUs:
- 2x Intel Xeon X5650 (Westmere, 6 cores each, total 12 cores)
- 72 GB RAM
- 8x Nvidia Tesla C2050 coprocessors
See:
- http://www.tyan.com/Barebones_FT72B7015_B7015F72V2R-N825%20%5BBTO%5D
- http://www.tyan.com/datasheets/FT72-B7015_Datasheet.pdf
- http://www.tyan.com/manuals/FT72-B7015_V2.1.pdf
- http://www.nvidia.com/docs/IO/43395/NV_DS_Tesla_C2050_C2070_jul10_lores.pdf
Note that supporting 8 GPUs on a dual-socket Nehalem-based machine requires, in addition to 2 IOHs connected via QPI buses, 2 PCI-e switches (PLXTech PEX8647 here). These switches may limit the bandwidth available to the PCI-e cards.
See:
- http://www.qdpma.com/systemarchitecture/systemarchitecture_qpi.html
- http://www.plxtech.com/products/expresslane/pex8647
- http://www.plxtech.com/download/file/586
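To give a rough idea of what "limiting the bandwidth" can mean, here is a minimal back-of-the-envelope sketch. It assumes PCI-e Gen2 links (5 GT/s per lane with 8b/10b encoding) and that several GPUs behind a switch share one x16 uplink while transferring at the same time. The function names and the `gpus_sharing` parameter are hypothetical and do not describe the measured topology of idgraf; check the board manual and `lstopo` for the actual layout.

```python
# Rough theoretical PCI-e Gen2 bandwidth estimate (illustrative only).
# Assumption: Gen2 signalling, 5 GT/s per lane, 8b/10b encoding.

def pcie_gen2_bandwidth_gb_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCI-e Gen2 link, in GB/s."""
    raw_gt_s = 5.0               # giga-transfers per second per lane
    payload_bits_per_10 = 8      # 8b/10b: 8 payload bits per 10 bits sent
    gbit_s = lanes * raw_gt_s * payload_bits_per_10 / 10
    return gbit_s / 8            # bits to bytes


def per_gpu_share_gb_s(uplink_lanes: int, gpus_sharing: int) -> float:
    """Per-GPU share when `gpus_sharing` GPUs saturate one uplink at once.

    `gpus_sharing` is a hypothetical parameter, not the documented
    topology of this machine.
    """
    if gpus_sharing < 1:
        raise ValueError("gpus_sharing must be at least 1")
    return pcie_gen2_bandwidth_gb_s(uplink_lanes) / gpus_sharing
```

For example, `per_gpu_share_gb_s(16, 2)` gives the theoretical share of each of two GPUs contending for one x16 uplink. Real throughput is lower because of protocol overhead, so measure with an actual transfer benchmark before drawing conclusions.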
How to experiment
Privileged commands
Currently, the following commands can be run via sudo in exclusive jobs:
- sudo /usr/bin/whoami (provided for testing the mechanism, should return "root")
- sudo /sbin/reboot
- sudo /usr/bin/schedtool
- sudo /usr/bin/opcontrol
- sudo /usr/bin/perf
- sudo /opt/likwid/bin/likwid-perfctr
- sudo /opt/likwid/bin/likwid-topology
- sudo /usr/bin/nvidia-smi (please notify other users via the digitalis mailing list if you change parameters on GPUs that will not be reset to default after a reboot, e.g. the memory ECC configuration)
- sudo /usr/local/bin/ipmi-reset
- sudo /usr/bin/lstopo
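When scripting experiments, it can help to check a command against the list above before calling sudo, so that a script fails early instead of blocking on a password prompt. The following is a minimal sketch: the helper names `build_sudo_command` and `sudo_mechanism_works` are hypothetical, and the allowlist is simply copied from this page, so it has to be kept in sync by hand.

```python
import subprocess

# Commands listed on this page as runnable via sudo in exclusive jobs.
ALLOWED_SUDO_COMMANDS = (
    "/usr/bin/whoami",
    "/sbin/reboot",
    "/usr/bin/schedtool",
    "/usr/bin/opcontrol",
    "/usr/bin/perf",
    "/opt/likwid/bin/likwid-perfctr",
    "/opt/likwid/bin/likwid-topology",
    "/usr/bin/nvidia-smi",
    "/usr/local/bin/ipmi-reset",
    "/usr/bin/lstopo",
)


def build_sudo_command(executable, *args):
    """Return the argv list for `sudo executable args...`.

    Raises ValueError if the executable is not in the allowlist above.
    """
    if executable not in ALLOWED_SUDO_COMMANDS:
        raise ValueError(f"{executable} is not in the sudo allowlist")
    return ["sudo", executable, *args]


def sudo_mechanism_works():
    """Run `sudo /usr/bin/whoami` and check that it reports root."""
    result = subprocess.run(
        build_sudo_command("/usr/bin/whoami"),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "root"
```

For instance, `build_sudo_command("/usr/bin/nvidia-smi", "-q")` returns the argument list to pass to `subprocess.run`, while a path that is not listed raises `ValueError` before anything is executed.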
System changelog
Currently, the default system is outdated:
- Debian squeeze
- Cuda 4
- ...
The system is due to be updated. Help is welcome.
Acknowledgment
The idgraf machine was funded by the Mescal and Moais teams of LIG/Inria.