PGI User’s Guide
Parallel Fortran, C and C++ for Scientists and Engineers

The Portland Group
STMicroelectronics
Two Centerpointe Drive
Lake Oswego, OR 97035

While every precaution has been taken in the preparation of this document, The Portland Group (PGI), a wholly-owned subsidiary of STMicroelectronics, Inc., makes no warranty for the use of its products and assumes no responsibility for any errors that may appear, or for damages resulting from the use of the information contained herein. The Portland Group retains the right to make changes to this information at any time, without notice. The software described in this document is distributed under license from STMicroelectronics, Inc. and/or The Portland Group and may be used or copied only in accordance with the terms of the license agreement ("EULA"). No part of this document may be reproduced or transmitted in any form or by any means, for any purpose other than the purchaser's or the end user's personal use without the express written permission of STMicroelectronics, Inc. and/or The Portland Group.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this manual, STMicroelectronics was aware of a trademark claim. The designations have been printed in caps or initial caps.

PGF95, PGF90, and PGI Unified Binary are trademarks; and PGI, PGHPF, PGF77, PGCC, PGC++, PGI Visual Fortran, PVF, Cluster Development Kit, PGPROF, PGDBG, and The Portland Group are registered trademarks of The Portland Group Incorporated. PGI CDK is a registered trademark of STMicroelectronics.
*Other brands and names are the property of their respective owners.

PGI User’s Guide
Copyright 1998 – 2000 The Portland Group, Inc.
Copyright 2000 – 2008 STMicroelectronics, Inc.
All rights reserved.
Printed in the United States of America

First Printing: Release 1.7, Jun 1998
Second Printing: Release 3.0, Jan 1999
Third Printing: Release 3.1, Sep 1999
Fourth Printing: Release 3.2, Sep 2000
Fifth Printing: Release 4.0, May 2002
Sixth Printing: Release 5.0, Jun 2003
Seventh Printing: Release 5.1, Nov 2003
Eighth Printing: Release 5.2, Jun 2004
Ninth Printing: Release 6.0, Mar 2005
Tenth Printing: Release 6.1, Dec 2005
Eleventh Printing: Release 6.2, Aug 2006
Twelfth Printing: Release 7.0-1, December 2006
Thirteenth Printing: Release 7.1, October 2007
Fourteenth Printing: Release 7.2, May 2008

Technical support: [email protected]: [email protected]:

Contents

Preface
  Audience Description
  Compatibility and Conformance to Standards
  Organization
  Hardware and Software Constraints
  Conventions
  Related Publications

1. Getting Started
  Overview
  Invoking the Command-level PGI Compilers
  Command-line Syntax
  Command-line Options
  Fortran Directives and C/C++ Pragmas
  Filename Conventions
  Input Files
  Output Files
  Fortran, C, and C++ Data Types
  Parallel Programming Using the PGI Compilers
  Running SMP Parallel Programs
  Running Data Parallel HPF Programs
  Platform-specific Considerations
  Using the PGI Compilers on Linux
  Using the PGI Compilers on Windows
  Using the PGI Compilers on SUA and SFU
  Using the PGI Compilers on Mac OS X
  Site-specific Customization of the Compilers
  Using siterc Files
  Using User rc Files
  Common Development Tasks

2. Using Command Line Options
  Command Line Option Overview
  Command-line Options Syntax
  Command-line Suboptions
  Command-line Conflicting Options
  Help with Command-line Options
  Getting Started with Performance
  Using -fast and -fastsse Options
  Other Performance-related Options
  Targeting Multiple Systems - Using the -tp Option
  Frequently-used Options

3. Using Optimization & Parallelization
  Overview of Optimization
  Local Optimization
  Global Optimization
  Loop Optimization: Unrolling, Vectorization, and Parallelization
  Interprocedural Analysis (IPA) and Optimization
  Function Inlining
  Profile-Feedback Optimization (PFO)
  Getting Started with Optimizations
  Local and Global Optimization using -O
  Scalar SSE Code Generation
  Loop Unrolling using -Munroll
  Vectorization using -Mvect
  Vectorization Sub-options
  Vectorization Example Using SSE/SSE2 Instructions
  Auto-Parallelization using -Mconcur
  Auto-parallelization Sub-options
  Loops That Fail to Parallelize
  Processor-Specific Optimization and the Unified Binary
  Interprocedural Analysis and Optimization using -Mipa
  Building a Program Without IPA - Single Step
  Building a Program Without IPA - Several Steps
  Building a Program Without IPA Using Make
  Building a Program with IPA
  Building a Program with IPA - Single Step
  Building a Program with IPA - Several Steps
  Building a Program with IPA Using Make
  Questions about IPA
  Profile-Feedback Optimization using -Mpfi/-Mpfo
  Default Optimization Levels
  Local Optimization Using Directives and Pragmas
  Execution Timing and Instruction Counting
  Portability of Multi-Threaded Programs on Linux
  libpgbind
  libnuma

4. Using Function Inlining
  Invoking Function Inlining
  Using an Inline Library
  Creating an Inline Library
  Working with Inline Libraries
  Updating Inline Libraries - Makefiles
  Error Detection during Inlining
  Examples
  Restrictions on Inlining

5. Using OpenMP
  Fortran Parallelization Directives
  C/C++ Parallelization Pragmas
  Directive and Pragma Recognition
  Directive and Pragma Summary Table
  Directive and Pragma Clauses
  Run-time Library Routines
  Environment Variables
  OMP_DYNAMIC