# FLOPS
| Name | Unit | Value |
|---|---|---|
| kiloFLOPS | kFLOPS | 10^3 |
| megaFLOPS | MFLOPS | 10^6 |
| gigaFLOPS | GFLOPS | 10^9 |
| teraFLOPS | TFLOPS | 10^12 |
| petaFLOPS | PFLOPS | 10^15 |
| exaFLOPS | EFLOPS | 10^18 |
| zettaFLOPS | ZFLOPS | 10^21 |
| yottaFLOPS | YFLOPS | 10^24 |
In computing, floating point operations per second (FLOPS, flops or flop/s) is a measure of computer performance, useful in fields of scientific computation that require floating-point calculations. For such cases it is a more accurate measure than instructions per second. The similar term FLOP is often used for a single floating-point operation, for example as a unit when counting the floating-point operations carried out by an algorithm or by computer hardware.

## Floating-point arithmetic

Floating-point arithmetic is needed for very large or very small real numbers, or for computations that require a large dynamic range. Floating-point representation is similar to scientific notation, except everything is carried out in base two rather than base ten. The encoding scheme stores the sign, the exponent (in base two for Cray and VAX, base two or ten for IEEE floating-point formats, and base 16 for the IBM Floating Point Architecture) and the significand (the digits after the radix point). While several similar formats are in use, the most common is ANSI/IEEE Std. 754-1985. This standard defines the format for 32-bit numbers called single precision, 64-bit numbers called double precision, and longer numbers called extended precision (used for intermediate results). Floating-point representations can support a much wider range of values than fixed-point, with the ability to represent both very small and very large numbers.
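As an illustration, the sign/exponent/significand fields described above can be inspected directly for an IEEE 754 single-precision value. This is a minimal Python sketch; the bit positions follow the standard binary32 layout:

```python
import struct

def decode_float32(x: float):
    """Split an IEEE 754 single-precision value into its stored fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8 exponent bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 stored significand bits
    return sign, exponent, fraction

sign, exp, frac = decode_float32(-6.5)
# -6.5 = -1.625 * 2**2, so the unbiased exponent is 2 and the
# significand is 1.625 (the leading 1 is implicit in the format).
assert sign == 1
assert exp - 127 == 2
assert 1 + frac / 2**23 == 1.625
```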

### Dynamic range and precision

The exponentiation inherent in floating-point computation assures a much larger dynamic range (the gap between the largest and smallest numbers that can be represented), which is especially important when processing data sets that are extremely large or whose range is unpredictable. As such, floating-point processors are ideally suited for computationally intensive applications.
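For a concrete sense of that dynamic range, Python's `sys.float_info` exposes the limits of the platform's IEEE 754 double-precision format. This is a quick illustrative check, not a benchmark:

```python
import math
import sys

# Largest and smallest positive normal values of IEEE 754 binary64.
largest = sys.float_info.max    # about 1.8e308
smallest = sys.float_info.min   # about 2.2e-308

# The representable range spans roughly 616 decimal orders of magnitude,
# far wider than any practical fixed-point format.
print(math.log10(largest) - math.log10(smallest))
```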

### Computational performance

FLOPS and MIPS are both units of measure for the numerical computing performance of a computer. Floating-point operations are typical of scientific computation, while MIPS measures a computer's integer performance. Examples of integer operations include data movement (A to B) or value testing (if A = B, then C). MIPS is an adequate benchmark when a computer is used for database queries, word processing, spreadsheets, or running multiple virtual operating systems. Frank H. McMahon, of the Lawrence Livermore National Laboratory, coined the terms FLOPS and MFLOPS (megaFLOPS) so that he could compare the supercomputers of the day by the number of floating-point calculations they performed per second. This was much better than using the prevalent MIPS, a statistic that usually had little bearing on a machine's arithmetic capability. FLOPS can be calculated using this equation:
$$\text{FLOPS} = \text{sockets} \times \frac{\text{cores}}{\text{socket}} \times \frac{\text{cycles}}{\text{second}} \times \frac{\text{FLOPs}}{\text{cycle}}$$
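The formula is straightforward to apply. A minimal sketch, with hypothetical machine parameters chosen only to illustrate the arithmetic:

```python
def peak_flops(sockets: int, cores_per_socket: int,
               cycles_per_second: float, flops_per_cycle: int) -> float:
    """Theoretical peak FLOPS: sockets x cores/socket x cycles/s x FLOPs/cycle."""
    return sockets * cores_per_socket * cycles_per_second * flops_per_cycle

# Hypothetical example: a 2-socket machine with 8 cores per socket,
# running at 2.5 GHz, executing 16 double-precision FLOPs per cycle.
peak = peak_flops(2, 8, 2.5e9, 16)
print(peak / 1e9, "GFLOPS")  # 640.0 GFLOPS
```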

## FLOPs per cycle for various processors

| CPU family | Double precision | Single precision |
|---|---|---|
| Intel Nehalem | | 8 SP FLOPs/cycle: 4-wide SSE addition + 4-wide SSE multiplication |
| Intel Sandy Bridge and Ivy Bridge | 8 DP FLOPs/cycle: 4-wide AVX addition + 4-wide AVX multiplication | 16 SP FLOPs/cycle: 8-wide AVX addition + 8-wide AVX multiplication |
| Intel Haswell, Broadwell and Skylake | 16 DP FLOPs/cycle: two 4-wide FMA instructions | 32 SP FLOPs/cycle: two 8-wide FMA instructions |
| Intel Xeon Skylake-SP (AVX-512) | 16 or 32 DP FLOPs/cycle: one or two 8-wide FMA instructions (depends on SKU) | 32 or 64 SP FLOPs/cycle: one or two 16-wide FMA instructions (depends on SKU) |
| | | 8 SP FLOPs/cycle: 4-wide SSE addition + 4-wide SSE multiplication |
| | | 16 SP FLOPs/cycle: 8-wide FMA |
| AMD Zen ("each core now has a pair of 128-bit FMA units of its own") | | 16 SP FLOPs/cycle: a pair of 4-wide FMA instructions |
| | | 8 SP FLOPs/cycle: 4-wide SSE addition + 4-wide SSE multiplication every other cycle |
| | | 4 SP FLOPs/cycle: 4-wide SSE addition every other cycle + 4-wide SSE multiplication every other cycle |
| | | 8 SP FLOPs/cycle: 8-wide AVX addition every other cycle + 8-wide AVX multiplication every other cycle |
| | | 8 SP FLOPs/cycle: 4-wide NEON addition every other cycle + 4-wide NEON multiplication every other cycle |
| | | 8 SP FLOPs/cycle: 4-wide NEON addition every other cycle + 4-wide NEON multiplication every other cycle |
| | | 8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add |
| | | 8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add |
| | | 8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add |
| ARM Cortex-A53 | | 8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add |
| ARM Cortex-A57 | | 8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add |
| | | 8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add |
| | | 8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add |
| | | 8 SP FLOPs/cycle: 4-wide NEONv2 FMA or 4-wide NEON multiply-add |
| | 8 DP FLOPs/cycle: 4-wide QPX FMA every cycle (SP elements are extended to DP and processed on the same units) | |
| | 4 DP FLOPs/cycle: 4-wide QPX FMA every other cycle (SP elements are extended to DP and processed on the same units) | |
| Intel Xeon Phi (Knights Corner), per core | | 32 SP FLOPs/cycle: 16-wide FMA every cycle |
| | | 16 SP FLOPs/cycle: 16-wide FMA every other cycle |
| | | 64 SP FLOPs/cycle: two 16-wide FMA instructions |
| GPU (varies by model) | | 2 SP FLOPs/cycle |
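The per-cycle figures above are theoretical maxima; real workloads, and especially interpreted languages, achieve far less. A crude sketch that times a multiply-add loop in pure Python illustrates the gap (the loop body and iteration count are arbitrary choices, and the reported rate will vary widely between machines):

```python
import time

def measure_flops(n: int = 1_000_000):
    """Crude estimate: each loop iteration does one multiply and one add."""
    x = 0.0
    start = time.perf_counter()
    for _ in range(n):
        x = x * 1.0000001 + 1.0
    elapsed = time.perf_counter() - start
    return 2 * n / elapsed, x

rate, _ = measure_flops()
# Interpreted Python typically manages tens of MFLOPS here, orders of
# magnitude below the hardware's per-core peak.
print(f"about {rate / 1e6:.0f} MFLOPS")
```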

## Performance records

### Single computer records


### Distributed computing records

Distributed computing uses the Internet to link personal computers to achieve more FLOPS:
• As of October 2016, the Folding@home network has over 100 petaFLOPS of total computing power. It was the first computing project of any kind to cross the 1, 2, 3, 4, and 5 native petaFLOPS milestones. This level of performance is primarily enabled by the cumulative effort of a vast array of powerful GPU and CPU units.

• As of June 2018, the entire BOINC network averages about 20 petaFLOPS.
• As of June 2018, SETI@Home, employing the BOINC software platform, averages 896 teraFLOPS.
• As of June 2018, Einstein@Home, a project using the BOINC network, is crunching at 3 petaFLOPS.
• As of June 2018, MilkyWay@Home, using the BOINC infrastructure, computes at 847 teraFLOPS.
• As of June 2018, GIMPS, searching for Mersenne primes, is sustaining 306 teraFLOPS.

### Future developments

In 2008, James Bamford's book The Shadow Factory reported that the NSA told the Pentagon it would need an exaflop computer by 2018. Given the speed of progress at the time, supercomputers were projected to reach 1 exaFLOPS (EFLOPS) in 2018. Cray, Inc. announced in December 2009 a plan to build a 1 EFLOPS supercomputer before 2020. Erik P. DeBenedictis of Sandia National Laboratories theorizes that a zettaFLOPS (ZFLOPS) computer would be required to accomplish full weather modeling over a two-week time span. Such systems might be built around 2030.

## Cost of computing

### Hardware costs

| Date | Approximate cost per GFLOPS | Approximate cost per TFLOPS (2017 US dollars) | Platform providing the lowest cost per GFLOPS | Comments |
|---|---|---|---|---|
| 1961 | $18.7 billion | | About 2,400 IBM 7030 Stretch supercomputers costing $7.78 million each | The IBM 7030 Stretch performs one floating-point multiply every 2.4 microseconds. |
| 1984 | $18,750,000 | $44.2 billion | Cray X-MP/48 | $15,000,000 / 0.8 GFLOPS |
| 1997 | $30,000 | $46,000,000 | Beowulf clusters with Pentium Pro microprocessors | |
| 2000 | | $1,440,000 | Bunyip Beowulf cluster | Bunyip was the first sub-US$1/MFLOPS computing technology. It won the Gordon Bell Prize in 2000. |
| 2000 | | $922,000 | KLAT2 (Kentucky Linux Athlon Testbed 2) | |
| 2003 | | $109,000 | KASY0 | |
| 2007 | | $57,000 | Microwulf | |
| 2011 | | $1,980 | HPU4Science | This $30,000 cluster was built using only commercially available "gamer" grade hardware. |
| 2012 | | $800 | Quad AMD Radeon 7970 GHz System | Four Radeon HD 7970 cards in a desktop computer reaching 16 TFLOPS of single-precision and 4 TFLOPS of double-precision computing performance. Total system cost was $3,000, built using only commercially available hardware. |
| 2013 | | $230 | Sony PlayStation 4 | The Sony PlayStation 4 is listed as having a peak performance of 1.84 TFLOPS, at a price of $400. |
| 2013 | | $170 | AMD Sempron 145 & GeForce GTX 760 System | A Sempron 145 and three GeForce GTX 760 cards reach a total of 6.771 TFLOPS for a total cost of $1,090.66. |
| 2013 | | $130 | Pentium G550 & Radeon R9 290 System | A Pentium G550 and an AMD Radeon R9 290 top out at 4.848 TFLOPS for a grand total of US$681.84. |
| 2015 | | $80 | Celeron G1830 & Radeon R9 295X2 System | A Celeron G1830 and an AMD Radeon R9 295X2 top out at over 11.5 TFLOPS at a grand total of US$902.57. |
| 2017 | | $60 | AMD Ryzen 7 1700 & AMD Radeon Vega Frontier Edition | About $3,000 for the complete system. |
| October 2017 | $0.03 | $30 | Intel Celeron G3930 & AMD RX Vega 64 | Three AMD RX Vega 64 graphics cards provide just over 75 TFLOPS half precision (38 TFLOPS SP or 2.6 TFLOPS DP when combined with the CPU) at about $2,050 for the complete system. |
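The dollars-per-GFLOPS figures follow from simple division of total system cost by peak performance. For instance, using the approximate numbers from the October 2017 entry:

```python
def cost_per_gflops(system_cost_usd: float, peak_tflops: float) -> float:
    """Dollars per GFLOPS, given total system cost and peak TFLOPS."""
    return system_cost_usd / (peak_tflops * 1000)  # 1 TFLOPS = 1000 GFLOPS

# A ~$2,050 system delivering ~75 TFLOPS (half precision):
print(round(cost_per_gflops(2050, 75), 3))  # 0.027, i.e. about $0.03/GFLOPS
```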


