General on Simultaneous Multithreading
- "Simultaneous Multithreading: A Platform for Next-Generation Processors". S. Eggers, J. Emer, H. Levy, J. Lo, R. Stamm, D. Tullsen. IEEE Micro, vol. 17, no. 5, IEEE Micro 1997. (ps)
- "ILP versus TLP on SMT". N. Mitchell, L. Carter, J. Ferrante, D. Tullsen. Proceedings of the ACM/IEEE conference on Supercomputing, SC 1999. (ps)
SMT Implementations
- "Hyper-Threading Technology Architecture and Microarchitecture". D. Marr, F. Binns, D. Hill, G. Hinton, D. Koufaty, J. Miller, M. Upton. Intel Technology Journal, vol. 6, issue 1, ITJ 2002. (pdf)
- "Initial Observations of the Simultaneous Multithreading Pentium 4 Processor". N. Tuck, D. Tullsen. Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques, PACT 2003. (pdf)
- "IBM Power5 Chip: A Dual-Core Multithreaded Processor". R. Kalla, B. Sinharoy, J. Tendler. IEEE Micro, vol. 24, no. 2, IEEE Micro 2004. (pdf)
Resource Sharing in SMTs
- "Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor". D. Tullsen, S. Eggers, J. Emer, H. Levy, J. Lo, R. Stamm. Proceedings of the 23rd annual International Symposium on Computer Architecture, ISCA 1996. (ps)
- "Handling Long-Latency Loads in a Simultaneous Multithreading Processor". D. Tullsen, J. Brown. Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, MICRO 2001. (pdf)
- "Front-End Policies for Improved Issue Efficiency in SMT Processors". A. El-Moursy, D. Albonesi. Proceedings of the 9th International Symposium on High-Performance Computer Architecture, HPCA 2003. (pdf)
- "The Impact of Resource Partitioning on SMT Processors". S. Raasch, S. Reinhardt. Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques, PACT 2003. (pdf)
- "Dynamically Controlled Resource Allocation in SMT Processors". F. Cazorla, A. Ramirez, M. Valero, E. Fernandez. Proceedings of the 37th annual ACM/IEEE international symposium on Microarchitecture, MICRO 2004. (pdf)
- "Learning-Based SMT Processor Resource Distribution via Hill-Climbing". S. Choi, D. Yeung. Proceedings of the 33rd annual International Symposium on Computer Architecture, ISCA 2006. (pdf)
- "Software-Controlled Priority Characterization of POWER5 Processor". C. Boneti, F. Cazorla, R. Gioiosa, A. Buyuktosunoglu, C. Cher, M. Valero. ACM SIGARCH Computer Architecture News, vol. 36, pp. 415-426, SIGARCH-CAN 2008. (pdf)
- "An Adaptive Resource Partitioning Algorithm for SMT Processors". H. Wang, I. Koren, C. Krishna. Proceedings of the 17th international conference on Parallel Architectures and Compilation Techniques, PACT 2008. (pdf)
Helper Threading
- "Simultaneous Subordinate Microthreading (SSMT)". S. Chappell, J. Stark, S. Kim, S. Reinhardt, Y. Patt. Proceedings of the 26th annual international Symposium on Computer Architecture, ISCA 1999. (pdf)
- "Speculative Data-Driven Multithreading". A. Roth, G. Sohi. Proceedings of the 7th International Symposium on High-Performance Computer Architecture, HPCA 2001. (pdf)
- "Tolerating Latency through Software-Controlled Pre-Execution in Simultaneous Multithreading Processors". C. Luk. Proceedings of the 28th Annual International Symposium on Computer Architecture, ISCA 2001. (ps)
- "Speculative Precomputation: Long-range Prefetching of Delinquent Loads". J. Collins, H. Wang, D. Tullsen, C. Hughes, Y. Lee, D. Lavery, J. Shen. Proceedings of the 28th annual International Symposium on Computer Architecture, ISCA 2001. (pdf)
- "Speculative Precomputation: Exploring the Use of Multithreading for Latency". H. Wang, P. Wang, R. Weldon, S. Ettinger, H. Saito, M. Girkar, S. Liao, J. Shen. Intel Technology Journal, vol. 6, issue 1, ITJ 2002. (pdf)
- "Transparent Threads: Resource Sharing in SMT Processors for High Single-Thread Performance". G. Dorai, D. Yeung. Proceedings of the 11th international conference on Parallel Architectures and Compilation Techniques, PACT 2002. (pdf)
- "Slipstream Execution Mode for CMP-Based Multiprocessors". K. Ibrahim, G. Byrd, E. Rotenberg. Proceedings of the 9th IEEE International Symposium on High-Performance Computer Architecture, HPCA 2003. (pdf)
- "Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors". D. Kim, J. Shen, S. Liao, P. Wang, J. Cuvillo, X. Tian, X. Zou, H. Wang, D. Yeung, M. Girkar. Proceedings of the international symposium on Code Generation and Optimization: feedback-directed and runtime optimization, CGO 2004. (pdf)
- "Supporting Fine-Grained Synchronization on a Simultaneous Multithreading Processor". D. Tullsen, J. Lo, S. Eggers, H. Levy. Proceedings of the IEEE 5th International Symposium on High Performance Computer Architecture, HPCA 1999. (ps)
- "Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers". J. Sampson, R. Gonzalez, J. Collard, N. Jouppi, M. Schlansker, B. Calder. Proceedings of the International Symposium on Microarchitecture, MICRO 2006. (pdf)
Job Scheduling
- "Symbiotic Job Scheduling with Priorities for a Simultaneous Multithreading Processor". A. Snavely, D. Tullsen, G. Voelker. Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, SIGMETRICS 2002. (pdf)
- "Architectural Support for Enhanced SMT Job Scheduling". A. Settle, J. Kihm, A. Janiszewski, D. Connors. Proceedings of the 13th international conference on Parallel Architectures and Compilation Techniques, PACT 2004. (pdf)
- "Scheduling Algorithms for Effective Thread Pairing on Hybrid Multiprocessors". R. McGregor, C. Antonopoulos, D. Nikolopoulos. Proceedings of the 19th International Symposium on Parallel and Distributed Processing, IPDPS 2005. (pdf)
- "Hyper-threading Aware Process Scheduling Heuristics". J. Bulpin, I. Pratt. Proceedings of the USENIX Annual Technical Conference, USENIX 2005. (pdf)
- "Dynamic Run-Time Architecture Techniques for Enabling Continuous Optimization". T. Moseley, A. Shye, V. Reddi, M. Iyer, D. Fay, D. Hodgdon, J. Kihm, A. Settle, D. Grunwald, D. Connors. Proceedings of the 2nd conference on Computing Frontiers, CF 2005. (pdf)
- "Compatible Phase Co-Scheduling on a CMP of Multi-threaded Processors". A. El-Moursy, R. Garg, D. Albonesi, S. Dwarkadas. Proceedings of the 20th International Symposium on Parallel and Distributed Processing, IPDPS 2006. (pdf)
- "Thread Clustering: Sharing-Aware Scheduling on SMP-CMP-SMT Multiprocessors". D. Tam, R. Azimi, M. Stumm. Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems, EuroSys 2007. (pdf)
Code Optimizations and Runtime Techniques
- "Tuning Compiler Optimizations for Simultaneous Multithreading". J. Lo, S. Eggers, H. Levy, S. Parekh, D. Tullsen. Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, MICRO 1997. (ps)
- "Maximizing TLP with Loop-Parallelization on SMT". D. Puppin, D. Tullsen. 5th Workshop on Multithreaded Execution, Architecture, and Compilation, MTEAC 2001. (ps)
- "Code and Data Transformations for Improving Shared Cache Performance on SMT Processors". D. Nikolopoulos. ISHPC, pp. 54-69, ISHPC 2003. (pdf)
- "Runtime Support for Integrating Precomputation and Thread-Level Parallelism on Simultaneous Multithreaded Processors". M. Curtis-Maury, T. Wang, C. Antonopoulos, D. Nikolopoulos. Proceedings of the 7th Workshop on Languages, Compilers, and Run-Time Support for Scalable Systems, LCR 2004. (pdf)
- "Runtime Empirical Selection of Loop Schedulers on Hyperthreaded SMPs". Y. Zhang, M. Voss. Proceedings of the 19th International Symposium on Parallel and Distributed Processing, IPDPS 2005. (pdf)
- "Integrating Multiple Forms of Multithreaded Execution on multi-SMT Systems: A Study with Scientific Applications". M. Curtis-Maury, T. Wang, C. Antonopoulos, D. Nikolopoulos. Proceedings of the Second International Conference on the Quantitative Evaluation of Systems, QEST 2005. (pdf)
- "Dynamic Tiling for Effective Use of Shared Caches on Multithreaded Processors". D. Nikolopoulos. International Journal of High Performance Computing and Networking, Vol. 2, pp. 22-35, IJHPCN 2006. (pdf)
- "Database Hash-Join Algorithms on Multithreaded Computer Architectures". P. Garcia, H. Korth. Proceedings of the 3rd conference on Computing Frontiers, CF 2006. (pdf)
- "Multigrain Parallel Delaunay Mesh Generation: Challenges and Opportunities for Multithreaded Architectures". C. Antonopoulos, X. Ding, A. Chernikov, F. Blagojevic, D. Nikolopoulos, N. Chrisochoides. Proceedings of the 19th annual international conference on Supercomputing, ICS 2005. (pdf)
- "Stream Programming on General-Purpose Processors". J. Gummaraju, M. Rosenblum. Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2005. (pdf)
SMT Extensions and Variations
- "Mini-Threads: Increasing TLP on Small-Scale SMT Processors". J. Redstone, S. Eggers, H. Levy. Proceedings of the IEEE 9th International Symposium on High Performance Computer Architecture, HPCA 2003. (pdf)
- "Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy". E. Tune, R. Kumar, D. Tullsen, B. Calder. Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2004. (pdf)