This thread is an attempt to summarize and discuss GTX 1080 specific issues.
Most of the below is scraped from elsewhere. Dumping my notes in case anyone else is interested.
https://forum.ethereum.org/discussion/2227/cuda-mine
https://devtalk.nvidia.com/default/board/57/
1) Windows 10 setup for GTX 1080 (same for 1070)
- Windows 10 + Anniversary Update (Link Here). If you hit problems, run dxdiag and check the second screen for WDDM 2.1.
- Install the latest Nvidia drivers, 369.09 and 372.54 (Link Here). If you hit a problem, wipe the installed drivers with this (Link Here) and reinstall the Nvidia drivers.
2) 1080 Mining rates
- See the P2 power state issue (section 3) to get these
- At max stock clocks - 5103 MHz Mem clock, 2062 MHz GPU clock
- Ethereum - 23-24 Mh/s with Claymore or Genoil 1.1.7 (perhaps others)
- Ethereum + Sia - 21.4 Mh/s Ethereum + 1430 Sia with Claymore 6.1 -dcr 200 (max payoff as of Week of Aug 21 2016)
- Data here if you like details (Link Here)
3) Known Issue: Compute runs in P2 power state
For prior Nvidia GPUs, this applies (How to change power state link), BUT it does not work for Pascal (1070/1080). Or at least, not yet. 1080 stock settings are:
- P2 = 4608 MHz Mem clock, 1847 MHz GPU clock
- P0 = 5103 MHz Mem clock, 2062 MHz GPU clock
Maxwell (9xx) GPUs also run compute in P2 versus the max specs of P0. To mine at P0 levels on the 1080, one solution is to overclock P2 to reach P0 levels. Do this at your own risk.
- Use your favorite overclocking tool and add +500 to Mem speed (4608 -> 5103) [or +1000 if you're seeing the 2x numbers 9216 -> 10206]
- Note: as long as you stay in P2/compute, the card isn't really overclocked as it's within max speed numbers.
- This is why you see what looks like crazy overclocking of +1000 in the forums. Again, as long as you stay in P2 (don't run a game), it should work.
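The offset arithmetic behind the bullets above can be sketched in a few lines of Python. The clock values are the ones quoted in this post; the function name is made up for illustration and is not part of any overclocking tool.

```python
# Stock GTX 1080 memory clocks quoted in this post (MHz).
P2_MEM_MHZ = 4608
P0_MEM_MHZ = 5103


def mem_offset_to_reach_p0(p2_mhz, p0_mhz, tool_shows_doubled=False):
    """Return the memory offset (MHz) that lifts the P2 clock to the P0 clock.

    Some tools display the doubled number (9216 instead of 4608); in that
    case the offset to enter is doubled as well.
    """
    offset = p0_mhz - p2_mhz
    return offset * 2 if tool_shows_doubled else offset


print(mem_offset_to_reach_p0(P2_MEM_MHZ, P0_MEM_MHZ))        # 495, i.e. roughly +500
print(mem_offset_to_reach_p0(P2_MEM_MHZ, P0_MEM_MHZ, True))  # 990, i.e. roughly +1000
```

So the +500 and +1000 figures are just the P0 minus P2 gap, rounded up.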
The following command is handy to watch the clocks and power states.
nvidia-smi.exe -i 0 -lms 500 --format=csv --query-gpu=name,driver_version,pstate,clocks.mem,clocks.sm,clocks_throttle_reasons.active,clocks_throttle_reasons.applications_clocks_setting
If you can't reach the max spec mem clock of 5103 MHz with overclocking, you're likely throttling on power, temp, or voltage cap. I maxed out voltage at +100% and power at +50% using the Gigabyte tool that came with the card (non-reference design).
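If you'd rather log the readings than eyeball them, here is a minimal Python sketch that parses CSV text from a shorter query (name,pstate,clocks.mem,clocks.sm with --format=csv) and flags samples still below the P0 memory clock. The sample text, including its header line, is hypothetical and not captured from a real card; columns are read by position, so adjust the indexes if you query more fields.

```python
import csv
import io

P0_MEM_MHZ = 5103  # max stock memory clock quoted in this post


def parse_smi_csv(text):
    """Parse CSV text for a name,pstate,clocks.mem,clocks.sm query.

    The first line is treated as a header and skipped; columns are read
    by position, and values like "4608 MHz" are reduced to integers.
    """
    rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))
    samples = []
    for row in rows[1:]:
        if len(row) < 4:
            continue  # skip blank or truncated lines
        name, pstate, mem, sm = (field.strip() for field in row[:4])
        samples.append({
            "name": name,
            "pstate": pstate,
            "mem_mhz": int(mem.split()[0]),  # "4608 MHz" -> 4608
            "sm_mhz": int(sm.split()[0]),
        })
    return samples


def below_p0_mem(sample, target_mhz=P0_MEM_MHZ):
    """True if the memory clock is still under the P0 target."""
    return sample["mem_mhz"] < target_mhz


# Hypothetical sample text, for illustration only.
SAMPLE = """name, pstate, clocks.mem [MHz], clocks.sm [MHz]
GeForce GTX 1080, P2, 4608 MHz, 1847 MHz
GeForce GTX 1080, P2, 5103 MHz, 2062 MHz
"""

for s in parse_smi_csv(SAMPLE):
    status = "below P0 mem clock" if below_p0_mem(s) else "at P0 mem clock"
    print(s["pstate"], s["mem_mhz"], s["sm_mhz"], status)
```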
4) How fast will the 1080 mine when "fixed" (opinion zone)
On Ethereum mining, the GTX 1070 is roughly 15% faster than the 1080 right out of the box (24 Mh/s versus 21 Mh/s). Yep, sounds broken. The reason is GDDR5 versus GDDR5X (next item).
All kinds of numbers are floating around for what the final number might be when Genoil returns from camping to do his magic and optimizes the 1080 to reach full glory, but... Ethereum mining is primarily memory bandwidth limited with this generation of cards. Both the RX 470/480 8GB and the 1070 have the same memory speed and bus width, and all three mine around 24 Mh/s before overclocking. The 1080 memory bandwidth is 25% better than the 1070 (10 GHz versus 8 GHz with the same memory width). So maybe, maybe, we could see +25% over the 1070, but not 2x. (Happy to be proven wrong as I have a 1080 and not a 1070.)
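To put a number on that guess: a rough Python sketch, assuming hashrate scales linearly with effective memory rate when the bus width is the same. That linear scaling is my simplification, not a measured result, and the function name is made up for illustration.

```python
def bandwidth_bound_estimate(base_rate_mhs, base_mem_ghz, target_mem_ghz):
    """Scale a hashrate linearly with effective memory rate (same bus width).

    This assumes mining is purely memory bandwidth limited, which is the
    opinion argued in this post rather than an established fact.
    """
    return base_rate_mhs * target_mem_ghz / base_mem_ghz


# 1070 baseline from this post: about 24 Mh/s at 8 GHz effective.
# 1080 spec from this post: 10 GHz effective.
estimate = bandwidth_bound_estimate(24.0, 8.0, 10.0)
print(f"{estimate:.1f} Mh/s")  # 30.0 Mh/s
```

That puts the optimistic ceiling around 30 Mh/s, well short of 2x.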
5) Known Issue: GDDR5X (1080) isn't performing as well as GDDR5 (1070) on mining
GDDR5X is better on spec, but it is new tech and doesn't appear to outshine GDDR5 on mining... yet. Here's some background on the technical details. tl;dr:
- 1070 GDDR5 runs at 2000 MHz and transfers 4 beats of data per cycle = 2000 * 4 = 8 GHz
- 1080 GDDR5X runs at 1251 MHz and transfers 8 beats of data per cycle = 1251 * 8 = 10 GHz
- The 1080 has 25% more spec bandwidth than the 1070, but can we use it?
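The same arithmetic as a short Python sketch. The base clocks and beats per cycle are from the bullets above; the 256-bit bus width is not stated in this post and is taken from the public specs of both cards, so treat it as an assumption. Function names are made up for illustration.

```python
BUS_WIDTH_BITS = 256  # assumption: public spec for both the 1070 and 1080


def effective_rate_mhz(base_clock_mhz, beats_per_cycle):
    """Effective data rate per pin: base clock times beats per cycle."""
    return base_clock_mhz * beats_per_cycle


def bandwidth_gb_per_s(effective_mhz, bus_width_bits=BUS_WIDTH_BITS):
    """Peak bandwidth: rate per pin * pins, bits -> bytes, MB/s -> GB/s."""
    return effective_mhz * bus_width_bits / 8 / 1000


gddr5 = effective_rate_mhz(2000, 4)   # 1070: 8000 MHz, i.e. 8 GHz
gddr5x = effective_rate_mhz(1251, 8)  # 1080: 10008 MHz, i.e. about 10 GHz

print(gddr5, gddr5x)
print(bandwidth_gb_per_s(gddr5), bandwidth_gb_per_s(gddr5x))
print(f"1080 vs 1070 spec ratio: {gddr5x / gddr5:.3f}")
```

With the same bus width, the ratio of effective rates is also the ratio of peak bandwidths, which is where the 25% figure comes from.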
6) Can the algos be rewritten to run much faster on GDDR5X versus GDDR5? (opinion, unlikely)
I suspect the answer is no. There may be Pascal optimizations, but these will raise both 1070 and 1080 boats.
The reason is that ETHash works off random reads of 128 byte blocks of memory. The 1070/1080 memory controllers have a 4 byte memory channel. GDDR5 will burst 8 * 4 bytes = 32 bytes and GDDR5X will burst 16 * 4 bytes = 64 bytes. Both are smaller than the 128 bytes needed by ETHash, so GDDR5X doesn't appear to break anything fundamental (such as having a burst length greater than the needed data). It can likely be tuned better once GDDR5X and the Pascal memory coalescing are understood, but it's not obvious that refactoring the algo and creating a 1080 version will make a huge difference. So maybe the 1080 can move from 10% slower than the 1070 to 25% faster, but big gains (> 2x) seem unlikely.
(If you want to try, see the spreadsheet link in the next paragraph. There's a link to Genoil's "GTX750Ti and buffers > 1GB on Win7", which is likely a good proxy app to experiment with. There are numbers for the 1080, 980 Ti, and 970 in there to compare with.)
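The burst-size argument, worked through in a small Python sketch. The 4 byte channel, the burst lengths of 8 and 16, and the 128 byte read size are the figures from this post; the function names are made up for illustration.

```python
CHANNEL_BYTES = 4        # memory channel width quoted in this post
ETHASH_READ_BYTES = 128  # size of one random ETHash read, per this post


def burst_bytes(burst_length, channel_bytes=CHANNEL_BYTES):
    """Bytes delivered by one burst on one channel."""
    return burst_length * channel_bytes


def bursts_per_read(read_bytes, burst_length):
    """Whole bursts needed to cover one read (ceiling division)."""
    size = burst_bytes(burst_length)
    return -(-read_bytes // size)


for name, burst_length in (("GDDR5", 8), ("GDDR5X", 16)):
    size = burst_bytes(burst_length)
    count = bursts_per_read(ETHASH_READ_BYTES, burst_length)
    wasted = count * size - ETHASH_READ_BYTES
    print(f"{name}: {size} byte burst, {count} bursts per read, {wasted} bytes wasted")
```

Both burst sizes divide 128 evenly, so neither memory type fetches bytes that ETHash throws away.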
Same spreadsheet linked from above; see the Memory System tab if interested in details. (Link Here)
7) Can the 1080 be overclocked for much higher mining rates, similar to the RX 470/480 8GB?
Unknown. The performance is coming from increasing the memory bandwidth. The RX 470/480 8GB 30ish Mh/s numbers require patching the BIOS with tighter GDDR5 timings. Something similar might work with the 1070, as it has a similar GDDR5 memory system (same speed and width). As the 1080 is GDDR5X based and it's new tech, we'll have to wait and see... Has anyone tried this?