Reviews & Opinions
Independent and trusted. Read before you buy the Sony PS-J10!

Sony PS-J10



About Sony PS-J10
Here you can find everything about the Sony PS-J10, such as the manual, reviews and other information.

The Sony PS-J10 manual (user guide) is available to download for free.

At the bottom of the page, users can write a review. If you own a Sony PS-J10, please write about it to help other people.

 

 

Manual


Download (English)
Sony PS-J10, size: 1.0 MB
Related manuals
Sony PS-J10 Annexe 1

 


 

 

User reviews and opinions


Comments to date: 3. Page 1 of 1. Average Rating:
Pro Elite 4:39pm on Saturday, August 21st, 2010 
travel battery charger for digital camera this item is exactly what I have been searching for. Kodak Z950 battery charger As stated previously I found this site EXpro on the net when I discovered what astonishing prices Kodak wanted for the same...
suresh 9:20pm on Thursday, April 29th, 2010 
A Very Complete Charger. The device consists of one standard piece, which is common to all the chargers from this brand.
Pheromone 1:20pm on Monday, April 5th, 2010 
Great replacement or back-up I bought this charger having misplaced the original. It was great to have around during my stay in the UK.

Comments posted on www.ps2netdrivers.net are solely the views and opinions of the people posting them and do not necessarily reflect our views or opinions.

 

Documents

doc0

3-198-123-14 (1)

Stereo Turntable System

Owner's Record

The model number and serial numbers are located at the rear. Record these numbers in the spaces provided below. Refer to these numbers whenever you call upon your Sony dealer regarding this product. Model No. PS-LX300USB Serial No.________________
Operating Instructions PS-LX300USB

© 2008 Sony Corporation

WARNING
To reduce the risk of fire or electric shock, do not expose this apparatus to rain or moisture. To prevent fire, do not cover the ventilation of the apparatus with newspapers, tablecloths, curtains, etc., and do not place lighted candles on the apparatus. To prevent fire or shock hazard, do not place objects filled with liquids, such as vases, on the apparatus. Do not install the appliance in a confined space, such as a bookcase or built-in cabinet. The unit is not disconnected from the AC power source (mains) as long as it is connected to the wall outlet, even if the unit itself has been turned off. Install this system so that the power cord can be unplugged from the wall socket immediately in the event of trouble.
Notice for the customers in the U.S.A.

This symbol is intended to alert the user to the presence of uninsulated dangerous voltage within the product's enclosure that may be of sufficient magnitude to constitute a risk of electric shock to persons.

This symbol is intended to alert the user to the presence of important operating and maintenance (servicing) instructions in the literature accompanying the appliance.

The Caution Marking is put on the Bottom Enclosure.

Important Safety Instructions
1) Read these instructions.
2) Keep these instructions.
3) Heed all warnings.
4) Follow all instructions.
5) Do not use this apparatus near water.
6) Clean only with dry cloth.
7) Do not block any ventilation openings. Install in accordance with the manufacturer's instructions.
8) Do not install near any heat sources such as radiators, heat registers, stoves, or other apparatus (including amplifiers) that produce heat.
9) Do not defeat the safety purpose of the polarized or grounding-type plug. A polarized plug has two blades with one wider than the other. A grounding type plug has two blades and a third grounding prong. The wide blade or the third prong are provided for your safety. If the provided plug does not fit into your outlet, consult an electrician for replacement of the obsolete outlet.
10) Protect the power cord from being walked on or pinched particularly at plugs, convenience receptacles, and the point where they exit from the apparatus.
11) Only use attachments/accessories specified by the manufacturer.
12) Use only with the cart, stand, tripod, bracket, or table specified by the manufacturer, or sold with the apparatus. When a cart is used, use caution when moving the cart/apparatus combination to avoid injury from tip-over.
13) Unplug this apparatus during lightning storms or when unused for long periods of time.
14) Refer all servicing to qualified service personnel. Servicing is required when the apparatus has been damaged in any way, such as power-supply cord or plug is damaged, liquid has been spilled or objects have fallen into the apparatus, the apparatus has been exposed to rain or moisture, does not operate normally, or has been dropped.

WARNING
This equipment has been tested and found to comply with the limits for a Class B digital device, pursuant to Part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference in a residential installation. This equipment generates, uses, and can radiate radio frequency energy and, if not installed and used in accordance with the instructions, may cause harmful interference to radio communications. However, there is no guarantee that interference will not occur in a particular installation. If this equipment does cause harmful interference to radio or television reception, which can be determined by turning the equipment off and on, the user is encouraged to try to correct the interference by one or more of the following measures:
Reorient or relocate the receiving antenna.
Increase the separation between the equipment and receiver.
Connect the equipment into an outlet on a circuit different from that to which the receiver is connected.
Consult the dealer or an experienced radio/TV technician for help.

NOTICE FOR THE CUSTOMERS IN THE UNITED KINGDOM
A moulded plug complying with BS1363 is fitted to this equipment for your safety and convenience. Should the fuse in the plug supplied need to be replaced, a fuse of the same rating as the supplied one and approved by ASTA or BSI to BS1362 (i.e., marked with the ASTA or BSI mark) must be used. If the plug supplied with this equipment has a detachable fuse cover, be sure to attach the fuse cover after you change the fuse. Never use the plug without the fuse cover. If you should lose the fuse cover, please contact your nearest Sony service station.

Notice for the customers in the countries applying EU Directives
The manufacturer of this product is Sony Corporation, 1-7-1 Konan Minato-ku Tokyo, 108-0075 Japan. The Authorized Representative for EMC and product safety is Sony Deutschland GmbH, Hedelfinger Strasse 61, 70327 Stuttgart, Germany. For any service or guarantee matters please refer to the addresses given in separate service or guarantee documents.

For the customers in Europe
Disposal of Old Electrical & Electronic Equipment (Applicable in the European Union and other European countries with separate collection systems)
This symbol on the product or on its packaging indicates that this product shall not be treated as household waste. Instead it shall be handed over to the applicable collection point for the recycling of electrical and electronic equipment. By ensuring this product is disposed of correctly, you will help prevent potential negative consequences for the environment and human health, which could otherwise be caused by inappropriate waste handling of this product. The recycling of materials will help to conserve natural resources. For more detailed information about recycling of this product, please contact your local Civic Office, your household waste disposal service or the shop where you purchased the product.
This equipment has been tested and found to comply with the limits set out in the EMC Directive using a connection cable shorter than 3 meters.
CAUTION You are cautioned that any changes or modification not expressly approved in this manual could void your authority to operate this equipment.
If you have any questions about this product, you may call: Sony Customer Information Services Center 1-800-222-7669 or http://www.sony.com/

Declaration of Conformity
Trade Name: SONY
Model Name: PS-LX300USB (Stereo Turntable System)
Responsible Party: Sony Electronics Inc.
Address: 16450 W. Bernardo Dr, San Diego, CA 92127 USA
Telephone No.: 858-942-2230
This device complies with Part 15 of the FCC Rules. Operation is subject to the following two conditions: (1) This device may not cause harmful interference, and (2) this device must accept any interference received, including interference that may cause undesired operation.

Notice for the customers in Canada This class B digital apparatus complies with Canadian ICES-003.

Table of Contents

About This Manual

Getting Started
Unpacking
Assembling the Turntable
Hooking Up the Turntable

Operations
Playing a Vinyl Record
Recording Audio Tracks on Your Computer

Additional Information
Precautions
Maintenance
Troubleshooting
Specifications
Parts and Controls

About This Manual

Thank you for purchasing the Sony Stereo Turntable System. Before operating the unit, please read this manual thoroughly and retain it for future reference.

IBM and PC/AT are registered trademarks of International Business Machines Corporation.
Microsoft, Windows and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
In this manual, Microsoft Windows XP Home Edition and Microsoft Windows XP Professional are referred to as Windows XP.
In this manual, Microsoft Windows Vista Home Basic, Windows Vista Home Premium, Windows Vista Business, and Windows Vista Ultimate are referred to as Windows Vista.
Sound Forge is a trademark or registered trademark of Sony Creative Software Inc. in the United States and other countries.
All other names of systems and products are trademarks or registered trademarks of their respective owners. ™ and ® marks are omitted in this manual.

Unpacking

Check that you received the following items with your turntable:
Platter (with drive belt) (1)
Rubber mat (1)
45 r/min adaptor (1)
USB cable (1)
CD-ROM, Sound Forge Audio Studio LE included (1)
Operating Instructions (this manual)
The installation guide for Sound Forge Audio Studio LE
Recording audio tracks of a vinyl record to your computer
Please read this first.

Assembling the Turntable

1 Move the metallic parts inside the larger gear in the direction of the arrow.
2 Carefully place the platter on the spindle.
3 Using the ribbon, loop the drive belt around the motor pulley.
After looping the belt, do not forget to remove the ribbon.
4 Place the rubber mat on the platter.

When the AC power cord is plugged after assembling or moving the turntable, the turntable sometimes rotates and the tone arm descends to the platter even if START is not pressed. If this occurs, press STOP to return the tone arm to the arm stand.

To remove the dust cover
With the dust cover fully opened, grasp both sides of the cover, then remove it carefully.
You can use the turntable leaving the dust cover removed. In that case, store the cover correctly.

To install the dust cover
Insert the hinge pockets on the dust cover into the hinges on the rear of the cabinet.

Hooking Up the Turntable

The phono cable comes attached to the rear of the cabinet.

1 Set the PHONO/LINE switch (on the rear of the turntable) according to the jacks of your stereo system (amplifier).
When your stereo system (amplifier) has PHONO input jacks (connect to PHONO input jacks), set to PHONO. The Equalizer function is off.
When your stereo system (amplifier) does not have PHONO input jacks (connect to AUX, VIDEO input jacks, etc.), set to LINE. The Equalizer function is on.
2 Connect the cable with the white plug to the white (L) jack and connect the cable with the red plug to the red (R) jack.
Make sure to insert the plugs firmly into the jacks. If the plugs are not inserted firmly, noise may occur.
3 Connect the AC power cord.
Connect the AC power cord to an AC wall outlet after completing all of the connections by performing steps 1 and 2.

Playing a Vinyl Record

1 Place a vinyl record on the platter.
Place only one vinyl record on the platter at a time. If two or more vinyl records are stacked on the platter, the stylus will not make proper contact with the grooves and the quality of reproduction will be impaired.
2 Press the Speed select button to select the speed.
3 Set the SIZE SELECTOR to 17 or 30.
4 Turn the protective cover to expose the stylus.
5 Close the dust cover.
6 Press START.
The platter starts rotating.
Lower the volume of the amplifier to prevent damage to it. If the tone arm descends and the stylus touches the vinyl record, a loud creaking noise may occur and harm the amplifier or speakers. Adjust the volume of the amplifier after the stylus descends.
7 Adjust the volume on your amplifier.

When playback is finished
The tone arm returns to the arm stand automatically, then the platter stops rotating.

To stop playing
Press the STOP button.
The tone arm returns to the arm stand. The platter stops rotating.

To pause playing
Press the UP/DOWN button to raise the stylus off the vinyl record.

To play a different part of the vinyl record
1 After performing step 4, press the UP/DOWN button, then lift the tone arm.
2 Move the tone arm to the position you desire.
3 Press the UP/DOWN button.
The tone arm descends to the record, then playback starts.

To play a 17-cm vinyl record
Place the supplied 45 r/min adaptor on the spindle. When you have finished using the adaptor, put it back in the adaptor tray.

Recording Audio Tracks on Your Computer

You can record audio tracks of a vinyl record on your computer by:
Connecting the turntable and your computer using the supplied USB cable
Using the supplied software, Sound Forge Audio Studio LE

System requirements for the computer to be connected to the turntable*
Compliant computer: IBM PC/AT or compatible computers
Operating systems (pre-installed, manufacturer installed only):
Windows Vista Home Basic
Windows Vista Home Premium
Windows Vista Business
Windows Vista Ultimate
Windows XP Home Edition Service Pack 2 or higher
Windows XP Professional Service Pack 2 or higher
Operating systems other than those listed above are not supported. 64-bit operating systems are not supported.
Port: USB port**
* Required when recording audio tracks of a vinyl record onto a computer via a USB connection.
** The USB jack of the turntable supports USB (full-speed).

Hardware Environment:

For the operating environment mentioned above, the turntable is not guaranteed to operate with all computers.
The turntable is not guaranteed to operate with home built computers, operating systems that are personally upgraded or multiple operating systems.
The turntable is not guaranteed to operate with functions such as system suspend, sleep (stand-by mode) and hibernation, on all computers.
The turntable is not guaranteed to operate with a USB hub or USB extension cable. Use the supplied USB cable.
Connect the USB cable to the USB jack/USB port securely; a loose connection may cause a malfunction.
Disconnect the USB cable when it is not in use.
When the turntable and computer are connected via USB cable and you play the turntable to record audio tracks on your computer, the audio that is adjusted with the turntable equalizer is input to the USB port of the computer.
USB drivers are included in the operating system if you are using Windows XP or Windows Vista. USB drivers will be installed automatically when the system is connected to the computer for the first time. For details, refer to the manual of your computer.

Installing the supplied software
Before recording, install the supplied software, Sound Forge Audio Studio LE. Once you have installed the software on your computer, you do not have to install it again unless you need to reinstall it.
1 Insert the supplied CD-ROM into the CD drive of your computer.
2 Install the software according to the on-screen instructions.
Tip
For details about installing the software, refer to the Sound Forge Audio Studio LE installation guide.
Connecting the USB cable
Connect the turntable and the computer with the supplied USB cable: one end to the USB jack of the turntable, the other to the USB port of the computer (not supplied).

Setting your computer
Before recording, set the recording audio device according to the operating system you use as follows:

Windows XP
([Control Panel] is in [Category View])
1 Select [Control Panel] from the [Start] menu.
2 Click [Sounds, Speech, and Audio Devices].
3 Click [Sounds and Audio Devices].
4 Click the [Audio] tab.
5 Select [USB Audio CODEC] for [Default device:] of [Sound recording].
6 Click [OK].

Windows Vista
([Control Panel] is in the [Control Panel Home] view)
1 Select [Settings] from the [Start] menu.
2 Select [Control Panel].
3 Click [Hardware and Sound].
4 Click [Sound].
5 Click the [Recording] tab.
6 Select [USB Audio CODEC] for [Microphone].
7 Click [Set default].
8 Select [USB Audio CODEC], then click [Properties].
9 Click the [Advanced] tab.
10 Select [2 channel, ...] (ex: [2 channel, 16 bit, 44100 Hz (CD Quality)]) from the [Default Format] drop-down list.
11 Click [OK].
To input stereo signal from the turntable to the computer, performing steps 8 through 11 is necessary.

Recording audio tracks from the turntable on a computer
For details on recording operations, refer to the supplied document titled "Recording audio tracks of a vinyl record to your computer." For details about using the software, refer to the Sound Forge Audio Studio LE quick start guide (on the supplied CD-ROM) or the on-line help of the software.
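For reference, the [2 channel, 16 bit, 44100 Hz (CD Quality)] format selected in the Windows Vista settings corresponds to a fixed raw data rate. The short Python sketch below works that rate out. It is only an illustration: the function name pcm_data_rate and the nominal 12 Mbit/s signalling rate assumed for USB full-speed are not taken from this manual.

```python
def pcm_data_rate(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Return the raw (uncompressed) PCM data rate in bits per second."""
    return channels * bits_per_sample * sample_rate_hz


# Format named in the manual: 2 channel, 16 bit, 44100 Hz (CD Quality).
CD_QUALITY_BPS = pcm_data_rate(channels=2, bits_per_sample=16, sample_rate_hz=44100)

# Assumption (not from the manual): nominal USB full-speed signalling rate.
USB_FULL_SPEED_BPS = 12_000_000

# Uncompressed size of one minute of recorded audio, in bytes.
BYTES_PER_MINUTE = CD_QUALITY_BPS // 8 * 60

# Share of the nominal bus rate used by the audio stream alone.
BUS_SHARE = CD_QUALITY_BPS / USB_FULL_SPEED_BPS

if __name__ == "__main__":
    print(f"Stream rate: {CD_QUALITY_BPS} bit/s")
    print(f"One minute of audio: {BYTES_PER_MINUTE / 1e6:.1f} MB")
    print(f"Share of nominal USB full-speed rate: {BUS_SHARE:.1%}")
```

Under these assumptions, a 20-minute record side takes a little over 200 MB of disk space before any compression, which is worth keeping in mind when choosing where to save recordings.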
Reference guides for the supplied software
Refer to the following instructions:
The installation guide for Sound Forge Audio Studio LE: Refer to this supplied manual to install Sound Forge Audio Studio LE.
The Sound Forge Audio Studio LE tutorial: This tutorial provides easy instructions on how to operate the software. After installing the software, the tutorial appears when you launch the software for the first time.
The Sound Forge Audio Studio LE quick start guide (on the supplied CD-ROM): This provides instructions on the basic operation of the software.

Note on recording
The recorded music is limited to private use only. Use of the music beyond this limit requires permission of the copyright holders.

Precautions

On safety
Before operating the unit, check that the operating voltage of your unit is identical with that of your local power supply.
Should any solid object or liquid fall into the cabinet, unplug the unit and have it checked by qualified personnel before operating it any further.
Unplug the unit from the wall outlet if it is not to be used for an extended period of time. To disconnect the cord, pull it out by the plug. Never pull the cord itself.

On placement
Place the unit on a level surface.
Avoid placing the unit near electrical appliances (such as a television, hair dryer, or fluorescent lamp) which may cause hum or noise.
Place the unit where it will not be subject to any vibration, such as from speakers, slamming of doors, etc.
Keep the unit away from direct sunlight, extremes of temperature, and excessive dust and moisture.

On repacking
Keep the carton and packing materials. They provide an ideal container to transport the unit.

If you have any question or problem concerning your unit that is not covered in this manual, please consult your nearest Sony dealer.

Maintenance

Stylus and record care
In order to prevent premature stylus and record wear, the stylus and record should be cleaned before playback.
To clean the stylus, brush it from back to front using a good quality stylus cleaning brush. Do not clean the stylus with your finger tip. When using a fluid stylus cleaner, make sure not to moisten the stylus too much.
To clean your records, wipe thoroughly using a good quality record cleaner.

Cleaning the cabinet and dust cover
Clean the cabinet and dust cover periodically using a soft dry cloth. If stains are difficult to remove, use a cloth moistened with a mild detergent solution. Do not use solvents such as alcohol, benzine or thinner, since they will damage the finish.

Replacing the stylus
The life expectancy of the stylus tip is about 500 hours. To preserve good sound quality and avoid damage to your records, we recommend replacing the stylus within this time limit. For a replacement stylus, consult your nearest Sony dealer.

To remove the stylus
1 Turn off and unplug the AC power cords of the turntable and amplifier.
2 Protect the stylus with the stylus cover.
3 Grasp the stylus holder and pull it downward away from the body of the cartridge/headshell as shown.

To install the stylus
Do this procedure with the stylus protected by the stylus cover.
1 Grasp both sides of the stylus holder, then insert the stylus grip into the cartridge/headshell receptacle.
2 Push up the stylus holder until it clicks so that it locks completely.
Do not leave any space.
Do not push the stylus cover forcefully. Otherwise, the exposed stylus from the cover may cause injury, or damage the stylus.

Troubleshooting

Before going through the check list below, first make sure that:
The power cord is securely connected.
The speaker cords are securely connected.
Should any problem persist after you have made these checks, consult your nearest Sony dealer.

Playing a vinyl record

The tone arm skips, skates or does not advance.
The turntable is not level. Place the turntable on a level surface.
The record is dirty or scratched. Clean the record with a commercially available record cleaning kit, or replace the record.

Poor sound quality, excessive noise, intermittent sound, etc.
The stylus is dirty or worn. Remove dust on the stylus using a stylus cleaning brush, or replace the worn stylus (see "Replacing the stylus").
Dust or dirt has collected on the vinyl record. Clean the record using a good quality record cleaner.

Rumble or low-frequency howl*.
The turntable is placed too close to speakers. Move speakers away from the turntable.
* This phenomenon, called acoustic feedback, occurs when vibrations from the speakers are transmitted through the air or via solid objects (such as shelves, a cabinet, or the floor) to the turntable where they are picked up by the stylus, amplified and reproduced through the speakers.

Tempo is incorrect.
Incorrect r/min. Set the r/min setting to match the one indicated on the vinyl record. (Select 33 for 33 1/3 r/min records or 45 for 45 r/min records.)
The drive belt is deteriorated. Replace the drive belt. For details, consult your nearest Sony dealer.

Platter does not rotate.
Make sure the power cord is inserted all the way into an AC wall outlet.
Make sure that the drive belt is looped around the motor pulley completely.
The drive belt is broken. Replace the drive belt. For details, consult your nearest Sony dealer.

Sound is too low or distorted.
The turntable is not connected to the PHONO IN inputs on the amplifier (see "Hooking Up the Turntable").

USB connection/recording

The turntable is not detected on your computer.
Disconnect the supplied USB cable, and connect it again.
With the turntable and the computer connected, restart the computer.
Disconnect the supplied USB cable, and restart the computer. After restarting, connect the computer and the turntable with the supplied USB cable.
The device setting of your computer may not be set correctly.
If the turntable is connected to the computer with the USB cable for the first time, the USB Composite Device, HID-compliant consumer control device, USB Human Interface Device and USB Audio Device (Windows XP)/USB Audio CODEC (Windows Vista) are installed automatically. To confirm if the driver is correctly installed, check as follows:

Windows XP
1 Select [Control Panel] from the [Start] menu.
2 Click [Performance and Maintenance].
3 Click [System].
4 Click the [Hardware] tab, and then click [Device manager].
5 Check the [Device manager] screen.
Check the devices installed as follows.
[USB Human Interface Devices] and [HID-compliant consumer control device] under [Human Interface Devices]
[USB Audio Device] under [Sound, video and game controllers]
[USB Composite Device] under [Universal Serial Bus controllers]

Windows Vista
1 Select [Settings] from the [Start] menu.
2 Select [Control Panel].
3 Select [System and Maintenance].
4 Select [Device Manager].
5 Check the [Device manager] screen.
Check the devices installed as follows.
[USB Human Interface Devices] and [HID-compliant consumer control device] under [Human Interface Devices]
[USB Audio CODEC] under [Sound, video and game controllers]
[USB Composite Device] under [Universal Serial Bus controllers]

For details on the operation of your computer, refer to the operating instructions of your computer.
There can be some breakdowns that cannot be solved even with Troubleshooting. In such cases, contact your nearest Sony dealer.
When connecting to different USB ports, you may have to install USB drivers.
When connecting to different USB ports, the computer automatically installs the driver again. In such case, check that the driver is correctly installed as in this procedure.

The recorded sound is disrupted.
The CPU of your computer is overloaded. Exit other applications.
Other USB devices are connected to your computer and being operated simultaneously. Quit operating other USB devices.

You cannot record sound from the turntable.
The audio recording device of your computer is not set correctly. See "Setting your computer" to check the device settings.

There is noise in recorded sound.
There are electrical wires, fluorescent lights or mobile phones near the turntable. Move away from any possible sources of electromagnetic interference.

Specifications

Motor and Platter
Drive system: Belt-drive
Motor: DC motor
Platter: 295 mm dia. (aluminum die-cast)
Speeds: 33 1/3 and 45 r/min, 2 speed
Wow and flutter: Less than 0.25% (WRMS)
Signal to noise ratio: More than 50 dB (DIN-B)

Tone Arm
Type: Dynamic balanced straight-shaped with soft damping control
Effective arm length: 195 mm

USB jack
Power supply: USB bus power compliant (5 V, 100 mA) (The power is supplied by the PC which is connected with an attached USB cable)
Output jack: Plug-in-power system (Dedicated USB jack), USB series B connector, USB (full-speed)

General
Power requirements: North American model: 120 V AC, 60 Hz; Other models: V AC, 50/60 Hz
Power consumption: 2 W
Dimensions: Approx. 360 mm (16 1/2 × 3 5/8 × 14 1/4 in) (w/h/d)
Mass: 3.3 kg (7 lb 5 oz)
Supplied Accessories:
45 r/min adaptor (1)
Platter (with drive belt) (1)
Rubber mat (1)
USB cable (1)
CD-ROM (1)
Operating Instructions (this manual)
The installation guide for Sound Forge Audio Studio LE
Recording audio tracks of a vinyl record to your computer
Please read this first.

Design and specifications are subject to change without notice.

Parts and Controls

Spindle
45 r/min adaptor
Speed select button
Rubber mat
Platter
Cartridge and headshell
START button
Insulator
Dust cover
Hinge
Tone arm
Arm stand
Finger lift
SIZE SELECTOR
STOP button
UP/DOWN button
PHONO/LINE switch
USB jack

Printed in China

doc1

A Rough Guide to Scientific Computing On the PlayStation 3
Technical Report UT-CS-07-595, Version 1.0
by Alfredo Buttari, Piotr Luszczek, Jakub Kurzak, Jack Dongarra, George Bosilca
Innovative Computing Laboratory, University of Tennessee Knoxville
May 11, 2007

Contents

1 Introduction

2 Hardware
2.1 CELL Processor
2.1.1 POWER Processing Element (PPE)
2.1.2 Synergistic Processing Element (SPE)
2.1.3 Element Interconnection Bus (EIB)
2.1.4 Memory System
2.2 PlayStation 3
2.2.1 Network Card
2.2.2 Graphics Card
2.3 GigaBit Ethernet Switch
2.4 Power Consumption

3 Software
3.1 Virtualization Layer: Game OS
3.2 Linux Kernel
3.3 Compilers
3.4 TCP/IP Stack
3.5 MPI

4 Cluster Setup
4.1 Basic Linux Installation
4.2 Linux Kernel Recompilation
4.3 IBM CELL SDK Installation
4.4 Network Configuration
4.5 MPI Installation
4.5.1 MPICH1
4.5.2 MPICH2
4.5.3 Open MPI

5 Development Environment
5.1 CELL Processor
5.2 PlayStation 3 Cluster

6 Programming Techniques
6.1 CELL Processor
6.1.1 Short Vector SIMDization
6.1.2 Intra-Chip Communication
6.1.3 Basic Steps of CELL Code Development
6.1.4 Quick Tips
6.2 PlayStation 3 Cluster

7 Programming Models
7.1 CorePy
7.2 Octopiler
7.3 RapidMind
7.4 PeakStream
7.5 MPI Microtask
7.6 Cell Superscalar
7.7 The Sequoia Language
7.8 Mercury Multi-Core Framework
7.9 IBM Accelerated Library Framework

8 Application Examples
8.1 CELL Processor
8.1.1 Dense Linear Algebra
8.1.2 Sparse Linear Algebra
8.1.3 Fast Fourier Transform
8.2 PlayStation 3 Cluster
8.2.1 The SUMMA Algorithm
8.3 Distributed Computing
8.3.1 Folding@Home

9 Summary
9.1 Limitations of the PS 3 for Scientific Computing
9.2 CELL Processor Resources
9.3 Future

Acronyms

Acknowledgements
We would like to thank Gary Rancourt and Kirk Jordan at IBM for taking care of our hardware needs and arranging for financial support. We are thankful to numerous IBM researchers for generously sharing with us their CELL expertise, in particular Sidney Manning, Daniel Brokenshire, Mike Kistler, Gordon Fossum, Thomas Chen, Jason Dale and Michael Perrone. Our thanks also go to Robert Cooper and John Brickman at Mercury Computer Systems for providing access to their hardware and software. We are also thankful to the Mercury research crew for sharing their CELL experience, in particular John Greene, Michael Pepe and Luke Cico. We are especially grateful to the following people for devoting their time to carefully review the work and help us improve it: Robert Cooper, Sidney Manning, Jason Dale and Joseph Czechowski (GE Research). We thank Chris Mueller from Indiana University for contributing section 7.1 about synthetic programming on the CELL in Python using CorePy. Thanks to Adelajda Zareba for the photography artwork for this guide.

CHAPTER 1. INTRODUCTION

As much as the Sony PlayStation 3 (PS3) has a range of interesting features, its heart, the CELL processor, is what the fuss is all about. CELL, a shorthand for CELL Broadband Engine Architecture, also abbreviated as CELL BE Architecture or CBEA, is a microprocessor jointly developed by the alliance of Sony, Toshiba and IBM, known as STI. The work started in 2000 at the STI Design Center in Austin, Texas, and for more than four years involved around 400 engineers and consumed close to half a billion dollars. The initial goal was to outperform desktop systems, available at the time of completion of the design, by an order of magnitude, through a dramatic increase in performance per chip area and per unit of power consumption. A quantum leap in performance would be achieved by abandoning the obsolete architectural model where performance relied on mechanisms like cache hierarchies and speculative execution, which in those days brought diminishing returns in performance gains. Instead, the new architecture would rely on a heterogeneous multi-core design, with highly efficient data processors at its heart. Their architecture would be stripped of costly and inefficient features like address translation, instruction reordering, register renaming and branch prediction. Instead they would be given powerful short vector SIMD capabilities and a massive register file. Cache hierarchies would be replaced by small and fast local memories and powerful DMA engines. This design approach resulted in a 200 million transistor chip, which today delivers performance barely approachable by its billion transistor counterparts and is available to the broad computing community in a truly off-the-shelf manner via a $600 gaming console.

CHAPTER 2. HARDWARE

2.1. CELL PROCESSOR
Figure 2.1: CELL Broadband Engine architecture [1].
The PPE is a 64-bit, 2-way simultaneous multithreading (SMT) processor binary compliant with the PowerPC 970 architecture. Although it uses the PowerPC 970 instruction set, its design is substantially different. It has a relatively simple architecture with in-order execution, which results in a considerably smaller amount of circuitry than its out-of-order execution counterparts and lower energy consumption. This can potentially translate to lower performance, especially for applications heavy in branches. However, the high clock rate, high memory bandwidth and dual threading capabilities may make up for the potential performance deficiencies. Especially important is the SMT feature, which to an extent corresponds to Intel's HyperThreading technology. The PPE appears to provide two independent execution units to the software layer. In practice the execution resources are shared, but each thread has its own copy of the architectural state, such as general-purpose registers. The technology comes at a 5% increase in the cost of the hardware and can potentially deliver from 10% to 30% increase in performance [2].
Clocked at 3.2 GHz, the PPE can theoretically deliver 2 × 3.2 = 6.4 Gflop/s of IEEE compliant double precision floating-point performance from its standard fully pipelined floating-point unit using the fused multiply-add (FMA) operation. It can also deliver 2 × 4 × 3.2 = 25.6 Gflop/s of non-IEEE compliant single precision floating-point performance from its VMX unit using the 4-way SIMD fused multiply-add operation. Although the PPE clocked at 3.2 GHz looks like quite a potent processor, its main purpose is to serve as a controller and supervise the other cores on the chip. Thanks to the PPE's compliance with the PowerPC architecture, existing applications can run on the CELL out of the box, and be gradually optimized for performance using the SPEs (see next section), rather than written from scratch.
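The peak figures quoted above are products of three factors: the two floating-point operations in one fused multiply-add, the number of SIMD lanes, and the clock rate in GHz. The following is a minimal Python sketch of that arithmetic. The function name peak_gflops and its parameters are ours, introduced purely for illustration, and the model assumes one pipelined operation issued per cycle.

```python
def peak_gflops(clock_ghz, simd_lanes, flops_per_op=2):
    """Theoretical peak in Gflop/s, assuming one pipelined operation per cycle.

    flops_per_op is 2 for a fused multiply-add (one multiply plus one add).
    """
    return clock_ghz * simd_lanes * flops_per_op


# PPE scalar floating-point unit, double precision: a single lane.
ppe_double = peak_gflops(3.2, simd_lanes=1)
# PPE VMX unit, single precision: 4-way SIMD.
ppe_single = peak_gflops(3.2, simd_lanes=4)

print(f"PPE double precision peak: {ppe_double:.1f} Gflop/s")
print(f"PPE single precision peak: {ppe_single:.1f} Gflop/s")
```

These are upper bounds only; sustained performance depends on how well the code keeps the pipeline full.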
Synergistic Processing Element (SPE)
The real power of the CELL processor does not lie in the PPE, but in the other cores. Eight of them are available in the current chip design, but only six are enabled in the PlayStation 3. One is disabled for wafer yield reasons (if one is defective it is disabled; if none are defective, a good one is disabled). Another is held hostage by the OS virtualization layer, the hypervisor, for internal purposes. These cores were originally named Attached Processing Units (APUs) (code name Seneca), and later Supplemental Processing Units (SPUs). At some point the name Streaming Processing Units was used, and eventually the name Synergistic Processing Units was adopted. The SPU is accompanied by 256 KB of local memory for both code and data, referred to as the local store (LS), and by the Memory Flow Controller (MFC). All those components together are referred to as the Synergistic Processing Element (SPE). Figure 2.2 shows the structure of the SPE.

The SPEs can only execute code residing in the local store and only operate on data residing in the local store. To the SPE the local store represents a flat 18-bit address space. Code and data can be moved between main memory and the local store through the internal bus (see next section) using the Direct Memory Access (DMA) capabilities of the Memory Flow Controller. In principle the SPEs constitute a small distributed memory system on a chip, where data motion is managed with explicit messaging. The DMA facilities enable perfect overlapping of communication and computation, where some data is being processed while other data is in flight.

The SPEs are the short vector SIMD workhorses of the CELL. They possess a large 128-entry, 128-bit vector register file, and a range of SIMD instructions which can operate simultaneously on 2 double precision values, 4 single precision values, 8 16-bit integers or 16 8-bit chars. Most instructions are pipelined and can complete one vector operation in each clock cycle. This includes fused multiplication-addition in single precision, which means that two floating-point operations can be accomplished on four values in each clock cycle, which translates to a peak of 2 × 4 × 3.2 = 25.6 Gflop/s for each SPE, and adds up to the staggering peak of

• Move both RPMs to the PlayStation 3 and install them by issuing as root:
    $ rpm -i libspe-1.2.0-0.ppc.rpm
    $ rpm -i libspe2-2.0.1-1.ppc.rpm
After this step the installation is complete and the PlayStation 3 is ready to run your CELL code. Do not hesitate, however, to browse through the CELL repositories of the Barcelona Supercomputer Center. It is an excellent resource for valuable CELL code and documentation. During the work on this document, IBM released an updated version of the development toolkit, the SDK 2.1, which includes many improvements and new components and requires Linux Fedora Core 6. Installation instructions are almost the same as for the previous SDK 2.0.

4.4 Network Configuration

In principle, configuring a cluster made of PS3 nodes is no different from configuring a cluster made of other kinds of nodes. Network configuration consists of a few simple steps that are well known to anybody who has built a cluster at least once. Different approaches may be followed when building a cluster, depending on the size of the cluster itself and the availability of resources and software. Our choice is to put the cluster nodes behind a front-end machine (a regular Linux box). This gives us a number of advantages, such as:
• Better security. Security policies can be enforced on the front-end and, since access to the nodes is only possible through the front-end, the whole cluster can be hidden from the rest of the network.
• Cluster symmetry. All the services needed by the cluster can be run on the front-end instead of one of the nodes. Services include shared volumes, Network Information Service, batch job schedulers, etc.

The front-end node must have two network cards; one will serve as an interface to the external network allowing users to remotely connect to the cluster, and the other will be connected to the internal cluster network. In the cluster internal network, each node, including the front-end, will have a static IP address. The network interface configuration can easily be done under Fedora Core Linux by editing the /etc/sysconfig/network-scripts/ifcfg-eth0 file on each node; this script will configure both the network interface and the routing table. The content of this file is shown in Figure 4.3.

    DEVICE=eth0
    BOOTPROTO=static
    HWADDR=xx:xx:xx:xx:xx:xx
    IPADDR=192.168.1.10
    NETMASK=255.255.255.0
    NETWORK=192.168.1.0
    BROADCAST=192.168.1.255
    ONBOOT=yes
    NAME=eth0
Figure 4.3: The /etc/sysconfig/network-scripts/ifcfg-eth0 script.
Once the /etc/sysconfig/network-scripts/ifcfg-eth0 file is filled with the proper information, it is only necessary to edit the /etc/resolv.conf and the /etc/hosts files in order to set a name server (DNS) and a list of the nodes' hostnames. Since these files will be the same on every node of the cluster, they can be created on one and then copied to the others. Finally, on each node, the hostname must be set with the command $ hostname node01 (where node01 must be replaced with the proper hostname).

An easier and more scalable approach would be to use a router that is capable of doing static DHCP. Once the router is configured, each node will acquire the IP address, hostname, and DNS server directly from the router through the DHCP protocol. In this case the content of the /etc/sysconfig/network-scripts/ifcfg-eth0 file becomes that in Figure 4.4.
    DEVICE=eth0
    BOOTPROTO=dhcp
    HWADDR=xx:xx:xx:xx:xx:xx
    ONBOOT=yes
    NAME=eth0
Figure 4.4: The /etc/sysconfig/network-scripts/ifcfg-eth0 script when DHCP is used.
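For the static setup, the per-node files differ only in the IP address, while /etc/hosts is identical everywhere, so both can be generated instead of typed by hand. The sketch below is one way to do that in Python. The naming scheme (node01, node02, ...), the starting address 192.168.1.10 and the /24 netmask are assumptions taken from the example values in Figure 4.3. The HWADDR line is deliberately left out because it is specific to each machine, and all function names are hypothetical.

```python
def cluster_hosts(n_nodes, base="192.168.1.", first_host=10, prefix="node"):
    """Map hostnames node01..nodeNN to consecutive static addresses (assumed scheme)."""
    return {f"{prefix}{i:02d}": f"{base}{first_host + i - 1}"
            for i in range(1, n_nodes + 1)}


def hosts_file(hosts):
    """Render /etc/hosts content; the same text is used on every node."""
    lines = ["127.0.0.1 localhost"]
    lines += [f"{ip} {name}" for name, ip in hosts.items()]
    return "\n".join(lines) + "\n"


def ifcfg_static(ip, device="eth0"):
    """Render a static ifcfg file for a /24 network (HWADDR omitted on purpose)."""
    net_prefix = ip.rsplit(".", 1)[0]
    return "\n".join([
        f"DEVICE={device}",
        "BOOTPROTO=static",
        f"IPADDR={ip}",
        "NETMASK=255.255.255.0",
        f"NETWORK={net_prefix}.0",
        f"BROADCAST={net_prefix}.255",
        "ONBOOT=yes",
        f"NAME={device}",
    ]) + "\n"


hosts = cluster_hosts(4)
print(hosts_file(hosts))
print(ifcfg_static(hosts["node01"]))
```

The generated text still has to be copied to each node, which is exactly the manual step that the static DHCP approach avoids.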
Once the network is configured, it is possible to set up the services. It is very important that all the users and user groups appear on every node with the same name and ID (UID for users and GID for groups). This can be done manually, but it is very inconvenient: every time a new user must be added, the account has to be created by hand on each node of the cluster. An easier and much more scalable solution is to use the Network Information Service (NIS); a good alternative is LDAP. The NIS server has to be installed on the front-end node and a client will be installed on each node. A good guide on how to set up the NIS server and the client can be found at http://tldp.org/HOWTO/NIS-HOWTO/index.html. Once the NIS server and clients are set up, new users need only be added to the front-end node (with a slightly different procedure than usual) and they will be acquired by the nodes through the NIS service.

UT Knoxville 23 ICL

Another important service that is almost mandatory to set up in a cluster is the network file system. Such a service provides shared disk volumes to the cluster users. These volumes (which usually correspond to the whole users' home directory) are visible from any cluster node, front-end included, and this means that if, for example, a file in this volume is edited, the changes do not have to be replicated on each node. There are many network file systems available, but the most common choice for small clusters is NFS. As for the NIS service, the NFS server has to be installed on the front-end and the clients on each node. The fact that the NFS server is on the front-end node also means that the shared volume has to physically be hosted there. Thus, the front-end must be equipped with a fast hard disk that is big enough to provide a reasonable quota to each user on the system. A good guide for the configuration of the NFS service can be found at http://tldp.org/

4.5. MPI INSTALLATION

• --enable-fast speeds up the build process for common configuration options.
• --disable-f77 disables FORTRAN 77 bindings.
• --disable-f90 disables FORTRAN 90 bindings.
• --disable-cxx disables C++ bindings.
• --disable-romio disables ROMIO, a well known implementation of the MPI I/O part of
• --disable-threads=single disables threading support: only single-threaded applications will be allowed to link against the resulting library.

Open MPI

Open MPI was created when the developers of FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI joined their efforts, pooled their experience, and invited contributors from other academic institutions, government labs, and vendors. It is a modern MPI implementation that is fully compliant with the MPI-2 standard. It is available from http://www.open-mpi.org/. The older release (1.1.x series) is still available, but users are strongly encouraged to download at least version 1.2 or even the nightly snapshots of the upcoming 1.3 release, as they are very stable.

Installation of Open MPI is also very simple. There are options for enabling a set of features at configuration time, such as detailed debugging support (so that the application stack trace will be printed upon program failure without the need for a debugger) and threading support (to support threaded applications and have a progress thread inside the library for better support for non-blocking communication). Launching of jobs is also simple. The start-up is very fast thanks to the use of daemons (similarly to MPICH2), but the daemons are managed transparently by Open MPI without requiring any user intervention. We tested two different builds of Open MPI:
• Version with debugging symbols, and
• Version without debugging symbols.
The exact options that were used for both of these versions are given in Figure 4.7.

Build with extensive debugging support:
    $ ./configure --with-platform=ps3 --enable-debug --enable-picky
    $ make -j2
    $ make install
Build without debugging support:
    $ ./configure --with-platform=ps3 --disable-debug --enable-picky
    $ make -j2
    $ make install
Figure 4.7: The commands we used to set up, compile and build two configurations of the Open MPI library.

For completeness, here are the Open MPI equivalents of the options we mentioned while describing both MPICH implementations (with default values given in parentheses) as well as additional options that are specific to Open MPI:
• --disable-mpi-f77 (default: enabled) disables FORTRAN 77 bindings.
• --disable-mpi-f90 (default: enabled) disables FORTRAN 90 bindings.
• --disable-mpi-profile (default: enabled) disables MPI's profiling interface.
• --disable-mpi-cxx (default: enabled) disables C++ bindings.
• --disable-mpi-cxx-seek (default: enabled) disables some C++ bindings related to I/O.
• --enable-mpi-threads (default: disabled) enables threading support in the Open MPI library.
• --enable-progress-threads (default: disabled) enables the use of a separate thread for handling asynchronous communication.

CHAPTER 5. DEVELOPMENT ENVIRONMENT

5.1. CELL PROCESSOR

The CELL SDK provides the development environment for programming the CELL processor. The current release of the SDK is a great package offering a range of software tools including: a suite of compilers and debuggers, SIMD math libraries, parallel programming frameworks, a full system simulator and the Eclipse IDE. As valuable as all those components are, the compiler suite is of the greatest importance to get you started running code on the CELL processor. Since the PPE and the SPEs are different architectures with disjoint address spaces, they require two distinct tool-chains for software development. The most important component of the tool-chain is the compiler suite, including both the GNU GCC and the IBM XLC compilers. The compilers for both architectures produce object files in the standard Executable and Linking Format (ELF). A special format, CBEA Embedded SPE Object Format (CESOF), allows SPE executable object files to be embedded inside PPE object files. Standard compilation steps include:
• Compilation of the SPU source code using either the GNU spu-gcc compiler or the IBM spuxlc compiler.
• Embedding of the SPU object code using the ppu-embedspu utility.
• Conversion of the embedded SPU code to an SPU library using the ppu-ar utility.
• Compilation of the PPU source code using either the GNU ppu-gcc compiler or the IBM ppuxlc compiler.
• Linking the PPU code with the library containing the SPU code and with the libspe library, to produce a single CELL executable file.

For example, given two source files ppu_code.c and spu_code.c, containing the PPU code and SPU code respectively, the executable cell_prog can be built with the toolchain 3.3 by using the rudimentary makefile from Figure 5.1.
TOOLCHAIN = /opt/cell/toolchain-3.3
SPU_GCC = $(TOOLCHAIN)/bin/spu-gcc
PPU_GCC = $(TOOLCHAIN)/bin/ppu-gcc
PPU_MBD = $(TOOLCHAIN)/bin/ppu-embedspu
PPU_AR = $(TOOLCHAIN)/bin/ppu-ar

all:
	$(SPU_GCC) -O3 -c spu_code.c
	$(SPU_GCC) -o spu_code spu_code.o
	$(PPU_MBD) -m32 spu_code spu_code spu_code_embed.o
	$(PPU_AR) -qcs spu_code_lib.a spu_code_embed.o
	$(PPU_GCC) -m32 -O3 -c ppu_code.c
	$(PPU_GCC) -m32 -o cell_prog ppu_code.o spu_code_lib.a -lspe

Figure 5.1: Rudimentary CELL makefile.
At this point, the development environment is still evolving, and writing your own makefile jeopardizes the portability of your compilation process to the next release of the SDK. IBM promotes the use of portable makefiles based on samples provided with the SDK. The samples, which can be found in the directory /opt/ibm/cell-sdk/prototype/src/samples/, use the same makefile structure, where the main directory contains the main makefile and two subdirectories, ppu and spu, for the PPU code and the SPU code, respectively. These subdirectories contain appropriate makefiles for compiling the PPU code and SPU code. The makefiles have a very simple structure, where the user's responsibility is to provide a few basic definitions and include the file /opt/ibm/cell-sdk/prototype/make.footer, which handles aggregation in order to create one CELL executable.
Figure 5.2: Compilation/building [7].

5.2 PlayStation 3 Cluster

The essential part of a cluster development environment is MPI. In a standard setting, an MPI implementation ships with a specialized command called mpicc. It invokes a standard compiler with extra flags that allow inclusion of the standard MPI header file mpi.h. The mpicc command can also be invoked to link MPI programs. In this mode, mpicc adds flags to the real linker invocation to pull in the libraries that implement the MPI standard. This is a sample sequence of commands to compile an MPI program from the file foo.c:
    $ mpicc -c foo.c
    $ mpicc -o foo foo.o
This is the preferred way of developing MPI applications, as it is independent of the MPI implementation being used, as long as the right mpicc command gets invoked, which can be adjusted with the PATH environment variable.

On a PlayStation 3 cluster, this mode of operation breaks down if IBM's CELL SDK is combined with MPI and they are not installed on the same machine (especially in a cross-compiling scenario). In this case, standard CELL SDK tools should be used with additional options that make them aware of the location of the files required by the MPI implementation. Unfortunately, this will make the process dependent on a particular MPI implementation and its supporting libraries. But once set up properly, it will work just as well as it would with the mpicc command. To have the MPI header files available on the development machine, they need to be copied from the include directory where the MPI implementation was installed. For MPICH1, the header files are mpi.h, mpidefs.h, mpi_errno.h, mpio.h, and mpi++.h (the last two only if support for MPI I/O and MPI C++ bindings was enabled during the setup). For MPICH2 the header files are mpi.h,

to the size of a cache line, which is 128 bytes.
• By default, DMA messages are not ordered. Ordering of DMAs can be enforced by the use of barriers and fences. A barrier orders a message with respect to messages issued before as well as after a given message. A fence orders a message only with respect to messages issued before the given message (Figure 6.3).
• DMA transfers are non-blocking by their very nature. While DMAs are in progress, the SPE should be doing some useful work and only check for DMA completion when it comes to processing the transferred data.
• DMA engines are parts of the SPEs. Each SPE can queue up to 16 requests in its own DMA queue. Each DMA engine also has a proxy DMA queue, which can be accessed by the PPE and other SPEs. The proxy queue can hold up to eight requests.
Figure 6.3: Barrier/fence [2].
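The difference between a fence and a barrier can be made concrete with a small model. The Python sketch below is not CELL code: it lists, for a sequence of DMA commands in issue order, which pairs the hardware must complete in order. It assumes that all commands sit in the same queue and tag group, and the labels "plain", "fence" and "barrier" as well as the function name are our own illustrative choices.

```python
def ordering_constraints(commands):
    """Return the direct (earlier, later) index pairs that must complete in order.

    commands lists "plain", "fence" or "barrier" in issue order.
    Pairs that are absent from the result may be reordered freely.
    """
    pairs = set()
    for later, kind_later in enumerate(commands):
        for earlier in range(later):
            if kind_later in ("fence", "barrier"):
                # A fenced or barriered command waits for everything issued before it.
                pairs.add((earlier, later))
            elif commands[earlier] == "barrier":
                # Everything issued after a barrier waits for the barrier itself.
                pairs.add((earlier, later))
    return pairs


# A fence in the middle only looks backwards ...
print(sorted(ordering_constraints(["plain", "fence", "plain"])))
# ... while a barrier also holds back the command issued after it.
print(sorted(ordering_constraints(["plain", "barrier", "plain"])))
```

Only direct constraints are listed; with a barrier in the middle, the first and last commands are additionally ordered through the barrier.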
Both the SPEs and the PPE are capable of initiating DMAs, but the SPE-initiated DMAs are more efficient and should be given preference over the PPE-initiated DMAs. Nevertheless, if the need arises to use the PPE-initiated DMAs, it can be accomplished by means of the MFC SPE proxy command functions described in SPE Runtime Management Library, chapter SPE MFC Proxy Command Functions. Although each single SPE has a theoretical bandwidth of 25.6 GB/s, which is equal to the peak bandwidth of the main memory, a single SPE will have a hard time saturating this bandwidth. In order to get good utilization of the bus, you should initiate many requests from many SPEs, and also refrain from ordering the messages, if possible, to give the arbiter the most room for traffic optimization.

An important aspect of the CELL communication system is the efficiency of local store to local store communication. The main memory offers considerable bandwidth of 25.6 GB/s. At the same time, however, the bus connecting the PPE, the SPEs and the main memory possesses much greater internal bandwidth of 204.8 GB/s. The important aspect here is that the bus is almost impossible to saturate by communication between the interconnected elements. It means that the SPEs, when accessing the main memory heavily, will exhaust the memory bandwidth. At the same time, when communicating between one another, they will never encounter a communication bottleneck. By the same token, if an application has the potential for SPE to SPE communication, such communication should definitely be given preference over main memory communication. An example of such patterns would be stream processing, where data is passed from one SPE to another in a pipeline fashion.

Although local store to local store transfers may seem a little less straightforward than main memory transfers, in practice they are not difficult to implement at all. Given that the CELL processor implements a global addressing scheme, in which each local store can be accessed by its effective address, local store to local store communication can be implemented as follows:
• The PPE retrieves the effective address of each local store by calling the spe_get_ls() function.
• The PPE passes the list of addresses of all local stores to all SPEs, through a DMA transfer.
• On the SPE side, a communication buffer is declared as a global variable and as a result has the same physical address within the local store on all SPEs.
• An SPE sums the physical buffer address with the effective address of the local store of another SPE to get the address of the remote buffer. It uses this address as the source address to pull data from the other SPE, or as a destination address to push data to the other SPE.

Local store to local store communication may prove invaluable not only for bulk data transfers, but also for synchronization between SPEs. One thing to remember here is the subvector alignment of source and destination for subvector length transfers.

Mailboxes are a convenient mechanism for sending short, 32-bit messages from the PPE to the SPEs and between the SPEs. The mailboxes are First-In-First-Out (FIFO) queues, meaning the messages are processed in the order of their issue. Each SPE has a four-entry mailbox for receiving incoming messages from the PPE and other SPEs, and two one-entry mailboxes for sending outgoing messages to the PPE and other SPEs, one of which serves the purpose of raising an interrupt on the receiving device. Mailbox operations are blocking on the SPE. An attempt to write to a full outbound mailbox will stall until the mailbox is cleared by a PPE read. Similarly, an attempt to read from an empty inbound mailbox will stall until the PPE writes to the mailbox. The same does not apply to the PPE. Neither an attempt to write to a full mailbox nor an attempt to read an empty mailbox will stall the PPE.

Mailboxes are useful to communicate short messages, such as completion flags or progress status. They can also serve the purpose of communicating short data, such as storage addresses and function parameters. The blocking nature of the mailboxes on the SPE side makes them perfect for the PPE to initiate actions on the SPEs. However, for two reasons they should not be used by the SPEs to acknowledge completion of operations to the PPE. DMA completion has a local meaning on the SPE. In other words, completion of a DMA on the SPE means that the local buffers are available for reuse, but not that the data made it to the memory. If a DMA transfer is immediately followed by an acknowledgment mailbox message, the message can make it to the PPE before the data. Also, the PPE continuously reading the SPE's outbound mailbox will flood the bus, causing loss of bandwidth. A better way of acknowledging completion of an operation or a data transfer from an SPE to the PPE is to use an acknowledgment DMA protected by a fence with respect to the data transfer DMA. The PPE can periodically test the memory location (variable) written to by the SPE, or even spin (busy wait) on the
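The address arithmetic behind the local store to local store scheme described earlier in this section is a single addition, which the following Python sketch illustrates outside of any CELL toolchain. The base addresses are made-up placeholders standing in for the values the PPE would obtain from spe_get_ls(), the buffer offset is arbitrary, and the function name is hypothetical.

```python
LS_SIZE = 256 * 1024  # 256 KB local store, i.e. an 18-bit local address space


def remote_buffer_ea(ls_base_ea, buffer_ls_addr):
    """Effective address of a buffer that lives inside another SPE's local store."""
    if not 0 <= buffer_ls_addr < LS_SIZE:
        raise ValueError("buffer address outside the local store")
    return ls_base_ea + buffer_ls_addr


# Placeholder effective addresses of six local stores (assumed values only).
ls_bases = [0xF0000000 + spe * 0x100000 for spe in range(6)]

# The buffer is a global variable, so its local address is the same on every SPE.
BUFFER_LS_ADDR = 0x10000

# Any SPE can now compute where the buffer of SPE 3 sits in the global address space
# and use the result as the source or destination of a DMA transfer.
target = remote_buffer_ea(ls_bases[3], BUFFER_LS_ADDR)
print(hex(target))
```

The range check reflects the 18-bit local address space: a local address beyond 256 KB cannot belong to a buffer in the local store.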

wait_SPEs_compl routine.

    ...
    for (i = 0; i < n; i++) {
        copy_into_buf(buf, data[i]);
        MPI_comm(buf, ...);
        start_SPEs(buf, ...);
        wait_SPEs_compl();
    }
    ...
Figure 6.6: MPI example code.
Overlapping communications and computations in this case can be done using a method that is conceptually equivalent to applying the double-buffering technique (well known to CELL programmers) at a higher level. In order to achieve this overlapping we need to take advantage of MPI non-blocking communications and double the number of buffers needed for the communications. The general idea is that, while the SPEs carry on the computations related to step N of the loop, the PPE performs the communications related to step N + 1. The code in Figure 6.6 is transformed into the code in Figure 6.7.

In the case where the PPE is involved in local computations, it is still possible to (partially) hide MPI communications. In fact, it is possible to take advantage of the fact that the PPE is a two-way, hyperthreaded processor. It is thus necessary to create two threads on the PPE, one that exclusively manages MPI communications while the other handles the local computations. Having a separate communication thread is beneficial not only in situations when TCP/IP transport demands significant PPE involvement in communication work. Current and future CELL-based platforms may be equipped with InfiniBand or maybe even 10 GigE/RDMA interconnections, which can, to a large extent, relieve the main processor from handling communication. A separate communication thread is still necessary to allow for overlapping communication with computation in many cases, including collective communication.
    ...
    copy_into_buf(buf1, data[0]);
    MPI_Icomm(buf1, ...);
    for (i = 0; i < n-1; i++) {
        copy_into_buf(buf2, data[i+1]);
        MPI_Icomm(buf2, ...);
        start_SPEs(buf1, ...);
        MPI_wait_on_buf(buf2, ...);
        wait_SPEs_compl();
        swap_buffers(buf1, buf2);
    }
    start_SPEs(buf1, ...);
    wait_SPEs_compl();
    ...
Figure 6.7: MPI example code with communication/computation overlapping.
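The same overlap pattern can be tried out on any machine with a small stand-in model. In the Python sketch below a worker thread plays the role of the non-blocking MPI communication and an ordinary function call plays the role of the SPE computation. The names fetch, compute, run_overlapped and run_sequential are hypothetical, and no MPI or CELL library is involved; the point is only the order of operations, with the communication for step i + 1 started before step i is computed.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(step):
    """Stand-in for a non-blocking communication: delivers the data of one step."""
    return [step * 10 + k for k in range(4)]


def compute(buf):
    """Stand-in for start_SPEs() followed by wait_SPEs_compl()."""
    return sum(buf)


def run_sequential(n):
    """Reference version: communicate, then compute, one step at a time."""
    return [compute(fetch(i)) for i in range(n)]


def run_overlapped(n):
    """Double-buffered version: step i is computed while step i + 1 is in flight."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as comm:
        pending = comm.submit(fetch, 0)                # prefetch the first buffer
        for i in range(n):
            current = pending.result()                 # wait for the data of step i
            if i + 1 < n:
                pending = comm.submit(fetch, i + 1)    # start communication for step i + 1
            results.append(compute(current))           # compute step i in the meantime
    return results


print(run_sequential(3))
print(run_overlapped(3))
```

Both versions produce the same results; the overlapped one merely hides the communication time behind the computation whenever the two take comparable time.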

CHAPTER 7. PROGRAMMING MODELS

The basic taxonomy for the CELL programming models was introduced by Kahle et al. [1]. Six models were distinguished, some of which can be qualified as PPE-centric and some as SPE-centric:
• Function offload model is one where the main application executes on the PPE and offloads performance-critical functions to the SPE by using calls to a library, possibly provided by a third party.
• Device extension model is a form of the function offload model, where the SPE provides services previously delivered by a device or acts as an intelligent front-end for a device.
• Computational acceleration model is an SPE-centric model, where standard parallel programming techniques are used to implement the most computationally intensive sections of code on the SPEs and the PPE acts mostly as a system service facility. Both shared and distributed memory programming techniques apply here.
• Streaming model is one where the SPEs are arranged in a pipeline, where each of them applies a particular computational kernel to the data that passes through it. The model is very attractive due to the fact that the internal bandwidth greatly exceeds the main memory bandwidth. Load balancing may become an issue if the pipeline stages do not have a nearly equal amount of work.
• Shared memory multiprocessor model can be utilized thanks to the DMA cache coherency capabilities. A conventional shared memory store is replaced by a combination of a store to the local store and a DMA to shared memory, with the PPE and all SPEs assigned to the same address space. Atomic update primitives can be used by utilizing the DMA lock line commands.
• Asymmetric thread runtime model is extremely flexible and widespread on conventional SMPs. On the SPEs, however, it would be very costly to implement full preemptive task switching, and some other model would have to be implemented, e.g., FIFO run-to-completion.

Aside from this taxonomy, it is important to notice that the advent of a multi-core processor brings different communities together. In particular, the CELL processor seems to ignite similar enthusiasm in the HPC/scientific community, the DSP/embedded community, and the GPU/graphics community. By the same token, the world of programming techniques proposed for the CELL is as diverse as the involved communities and includes shared-memory models, distributed memory models, and stream processing models, to name the most prominent ones. In this chapter, we present a brief overview of a few emerging frameworks for programming the CELL processor.

7.1 CorePy

CorePy is a research project at Indiana University, freely available for evaluation (http://www.corepy.org). CorePy is a library for rapid application development on the CELL processor that lets developers create SPU and PPU programs using the Python programming language. At its heart, CorePy is a complete replacement for assembly-level programming on the CELL. It provides an API that includes Python functions for every PowerPC, VMX, and SPU instruction. These functions can be used to build highly optimized sub-programs, called synthetic programs, at run time. Once created, synthetic programs can be executed directly from Python on an SPU or PPU, synchronously or asynchronously. By combining very low-level code with a high-productivity language, CorePy enables new approaches to developing high-performance applications.

In addition to the instruction-level APIs, CorePy includes libraries of components that abstract common operations. The Variable library provides objects for common data types with semantics similar to C data types. Instead of writing out the SPU instructions to add two vector registers, the Variable library lets developers use Python expressions to generate the instructions as the expression is evaluated. In the same vein, the Iterator library is a collection of Python iterators that generate optimized loops using Python syntax. The Iterator library contains iterators for simple array iteration (scalar and vector), double-buffering between main memory and SPU local store, loop unrolling, and automatic block decomposition for executing loops across multiple SPUs. Synthetic programs can interact with any data available to the Python interpreter, making it possible to use synthetic programs in conjunction with other Python libraries, such as NumPy, for high-

7.9 IBM Accelerated Library Framework
IBM Accelerated Library Framework (ALF) [18] is a small programming environment attempting to simplify CELL code development by providing an interface to programming data-parallel applications. ALF is distributed as a part of the CELL SDK.

ALF puts emphasis on division of labor between three types of programmers: compute kernel developers, accelerated library developers and application developers. In this scheme, kernel developers are responsible for writing optimized accelerator code, and library developers are responsible for breaking the problem into the control process and the compute tasks. Application developers are supposed to be the users of the libraries developed by the two groups just mentioned, and only program at the host level.

ALF defines two types of tasks: control tasks, which naturally map to the PPE, and compute tasks, which map to the SPEs. The main programming construct is a compute task, which is run in parallel on the accelerators (SPEs). A compute task along with its data buffers constitutes a work block. Data buffers can be of input type, output type and overlapped input and output type. Single-use and multi-use work blocks are distinguished. A task context buffer allows for common persistent data that can be referenced by all work blocks. Work blocks are scheduled for execution by using work queues. The ALF runtime supports double buffering. Rudimentary synchronization mechanisms are provided, such as barrier, notify and callback, and query. ALF includes limited runtime error handling capabilities relying on callback error handlers, which can be registered by the programmer.

CHAPTER 8

Dense Linear Algebra
A fundamental capability of computers is to solve dense systems of linear equations. It is also the established way of benchmarking the most powerful computers, which are currently ranked on the TOP500 list (http://www.top500.org/) according to their score on the Linpack benchmark [19], a linear system solver based on Gaussian elimination with partial pivoting. Unfortunately for the CELL processor, Linpack is defined in double precision, where the CELL delivers respectable but not astonishing performance. Results for the initial implementation of the Linpack benchmark in double precision were reported by Chen et al. [20]. For a matrix of size 2K × 2K they achieved 11.05 Gflop/s, which is around 75% of the double precision peak. They have also implemented a single precision version of the code, which achieved 155 Gflop/s (again around 75% efficiency) for a matrix of size 4K × 4K. Unfortunately, a single precision algorithm does not legitimately implement the Linpack benchmark. This is where the idea of iterative refinement comes into play. Iterative refinement is a well known method for improving the solution of a linear system of equations of the form Ax = b. First, the coefficient matrix A is factorized using LU decomposition into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is used to maintain numerical stability, resulting in the factorization PA = LU, where P is the row permutation
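To make the idea tangible, here is a small self-contained Python sketch of mixed precision iterative refinement in its textbook form: the factorization and the triangular solves are rounded to single precision, while the residual and the solution update are kept in double precision. It is an illustration under our own assumptions, not the implementation discussed in this guide. Single precision is emulated by rounding through the struct module, the 3 × 3 test system is made up, and all function names are hypothetical.

```python
import struct


def f32(x):
    """Round a double precision Python float to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting, all results rounded to single precision.

    Returns (lu, perm): L (unit diagonal) and U packed in one matrix, plus the row order.
    """
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))   # pivot row
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve L U x = P b with every operation rounded to single precision."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):                       # forward substitution (unit lower triangle)
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):             # back substitution (upper triangle)
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def residual(a, x, b):
    """r = b - A x, evaluated in double precision."""
    return [b[i] - sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(b))]


def solve_refined(a, b, iterations=5):
    """Single precision solve followed by double precision iterative refinement."""
    lu, perm = lu_factor_f32(a)
    x = lu_solve_f32(lu, perm, b)
    for _ in range(iterations):
        z = lu_solve_f32(lu, perm, residual(a, x, b))   # cheap correction, reusing L and U
        x = [xi + zi for xi, zi in zip(x, z)]           # update kept in double precision
    return x


A = [[4.1, 1.3, 2.2], [1.3, 5.7, 3.1], [2.2, 3.1, 6.9]]   # made-up, well conditioned
b = [1.0, 2.0, 3.0]
x_single = lu_solve_f32(*lu_factor_f32(A), b)
x_refined = solve_refined(A, b)
print("single precision residual:", max(abs(r) for r in residual(A, x_single, b)))
print("refined residual:         ", max(abs(r) for r in residual(A, x_refined, b)))
```

The expensive factorization is done once in the fast precision; each refinement step costs only a matrix-vector product and two triangular solves. The approach relies on the matrix being reasonably well conditioned, as the example system is.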

9.2 CELL Processor Resources
By now the Internet is full of CELL resources, so here we point to the best places to start looking for information. We recommend starting any search for documentation and software at the DeveloperWorks website http://www-128.ibm.com/developerworks/power/cell/, which also hosts a very good forum for CELL developers. Another great resource is the website of the Barcelona Supercomputing Center http://www.bsc.es/, which hosts plenty of documentation and also serves as a software repository. Finally, the CellPerformance website http://www.cellperformance.com/ contains useful software installation guidelines and practical performance tuning tips.

Future

One of the major shortcomings of the current CELL processor for numerical applications is the relatively slow speed of its double precision arithmetic. The next incarnation of the CELL processor is going to include a fully pipelined double precision unit, which will deliver 12.8 Gflop/s from a single SPE clocked at 3.2 GHz and 102.4 Gflop/s from an 8-SPE system. This will make the chip a very strong competitor in the world of scientific and engineering computing. Although in agony, Moore's Law is still alive, and we are entering the era of billion-transistor processors. By that measure, the current CELL processor employs a rather modest 234 million transistors. It is not hard to envision a CELL processor with more than one PPE and many more SPEs, perhaps reaching a performance of one Tflop/s on a single chip. That is still speculation, though.
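The peak figures quoted above follow from simple arithmetic. The sketch below assumes 4 double precision floating point operations per cycle per SPE (a 2-wide SIMD unit issuing one fused multiply-add per cycle); that factor is our assumption, chosen because it reproduces the quoted numbers, and `peak_gflops` is a hypothetical helper.

```python
def peak_gflops(clock_ghz, flops_per_cycle, units=1):
    """Theoretical peak in Gflop/s: clock rate x flops per cycle x unit count."""
    return clock_ghz * flops_per_cycle * units


CLOCK_GHZ = 3.2        # SPE clock rate quoted in the text
FLOPS_PER_CYCLE = 4    # assumption: 2 SIMD lanes x 2 flops (fused multiply-add)
NUM_SPES = 8           # SPE count quoted in the text

if __name__ == "__main__":
    print("per SPE:", peak_gflops(CLOCK_GHZ, FLOPS_PER_CYCLE), "Gflop/s")
    print("8 SPEs: ", peak_gflops(CLOCK_GHZ, FLOPS_PER_CYCLE, NUM_SPES), "Gflop/s")
```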

Bibliography

[1] J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, and D. Shippy. Introduction to the Cell multiprocessor. IBM J. Res. & Dev., 49(4/5):589-604, 2005. http://www.research.ibm.com/journal/rd/494/kahle.pdf.
[2] IBM. Cell Broadband Engine Programming Handbook, Version 1.0, April 2006.
[3] IBM. IBM Full-System Simulator User's Guide, Modeling Systems based on the Cell Broadband Engine Processor, Version 2.0, November 2006.
[4] IBM. Performance Analysis with the IBM Full-System Simulator, Modeling the Performance of the Cell Broadband Engine Processor, Version 2.0, November 2006.
[5] IBM. IBM Full-System Simulator Command Reference, Understanding and Applying Commands in the IBM Full-System Simulator Environment, Version 0.01, October 2005.
[6] IBM. Software Development Kit 2.0 Installation Guide, Version 2.0, December 2006.
[7] A. J. Eichenberger et al. Using advanced compiler technology to exploit the performance of the Cell Broadband Engine architecture. IBM Sys. J., 45(1):59-84, 2006. http://www.research.

 
