External Sort

Sorting

CS 102
File Structures & File Organizations

Sorting – arranging the items in a list in ascending or descending order by a key value. It applies to all file organizations, not just sequential files.

Why sort? To produce a report, to merge files in queries, to merge files in master file maintenance, to make searches easier, to prioritize, etc.

Chapter 05

External Sorting Algorithms

Internal vs External Sorts
Internal Sort – sorting items entirely in main memory (ICS 2, ICS 3, CS 101)
External Sort – sorting files in secondary storage using main memory (CS 102)

Why external sort? Some files may be too large to fit in main memory.

Some Terminologies
A Pass – an iteration that goes through the items (or records) of a list (or file) once, including reading them from the file, processing them in main memory, and writing them back to a file.
A Run – a grouping of some items of a list. A run usually starts as one block of records and grows in size as runs are merged.
Size of a Run – the number of items in a run. Usually no less than the blocking factor.
A Merge – combining lists into one.

The Algorithms
Overview of the external sort algorithms:
2-way Sort Merge
Balanced 2-way Sort Merge
Balanced k-way Sort Merge
Polyphase Sort Merge

2-Way Sort Merge
A simple 2-way Sort Merge repeatedly merges 2 smaller sorted components of a file into a bigger sorted component of the file.

The algorithm:
Phase 1: The Sort Phase
Phase 2: The Merge Phase
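Because each merge combines two runs into one, a merge pass roughly halves the number of runs. The short Python sketch below counts how many merge passes are needed for a given number of initial runs. The function name merge_passes_needed is hypothetical, and the count covers merge passes only, not the extra I/O spent redistributing runs between files.

```python
def merge_passes_needed(initial_runs):
    """Count 2-way merge passes needed to reduce initial_runs runs to one run.

    Assumption for this sketch: every pass merges the runs in pairs, so r runs
    become ceil(r / 2) runs; an unpaired run is carried over unchanged.
    """
    if initial_runs < 0:
        raise ValueError("initial_runs must not be negative")
    passes = 0
    runs = initial_runs
    while runs > 1:
        runs = (runs + 1) // 2  # ceil(runs / 2) using integer arithmetic
        passes += 1
    return passes


for r in (1, 2, 5, 8, 1000):
    print(r, "initial runs ->", merge_passes_needed(r), "merge passes")
```

The loop is used instead of a floating-point logarithm so that the result stays exact for large run counts.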

The Sort Phase
Phase 1: The Sort Phase
Divide the records of a file into several runs, sort the records of each run with an internal sort, and distribute the runs “evenly” between two external files, file_1 and file_2.
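A minimal Python sketch of the sort phase follows. It rests on several assumptions that are not part of the slides: in-memory lists stand in for the external files, each "file" is a list of runs, run_size is a hypothetical parameter for how many records fit in main memory at once, and "evenly" is taken to mean dealing the runs alternately to the two files.

```python
from itertools import islice


def sort_phase(records, run_size):
    """Split records into sorted runs and deal them alternately to two 'files'.

    Assumption: file_1 and file_2 are modeled as lists of runs, and a run is a
    sorted Python list. A real implementation would write the runs to disk.
    """
    if run_size < 1:
        raise ValueError("run_size must be at least 1")
    file_1, file_2 = [], []
    source = iter(records)
    run_number = 0
    while True:
        run = list(islice(source, run_size))  # read at most run_size records
        if not run:
            break
        run.sort()  # internal sort of one run in main memory
        # Alternate the target so the two files differ by at most one run.
        target = file_1 if run_number % 2 == 0 else file_2
        target.append(run)
        run_number += 1
    return file_1, file_2


file_1, file_2 = sort_phase([9, 4, 7, 1, 8, 2, 6, 3, 5], run_size=2)
print(file_1)  # [[4, 9], [2, 8], [5]]
print(file_2)  # [[1, 7], [3, 6]]
```

Reading the input through an iterator, run_size records at a time, mirrors the point of an external sort: only one run needs to be in main memory at any moment.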

The Merge Phase
Phase 2: The Merge Phase
For each pair of runs, one from file_1 and the other from file_2, merge the pair into a longer run. Store the resulting run in a third external file, file_3.

Redistribute the runs in file_3 evenly to file_1 and file_2.
Repeat Phase 2 until all records are in one long run.
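The merge phase can be sketched in Python as below, under the same kind of assumptions: each "file" is modeled as an in-memory list of sorted runs rather than a real external file, the function names merge_runs and merge_phase are hypothetical, and a run with no partner in the other file is copied to file_3 unchanged.

```python
def merge_runs(run_a, run_b):
    """Merge two sorted runs into one longer sorted run."""
    merged = []
    i = j = 0
    while i < len(run_a) and j < len(run_b):
        if run_a[i] <= run_b[j]:
            merged.append(run_a[i])
            i += 1
        else:
            merged.append(run_b[j])
            j += 1
    merged.extend(run_a[i:])  # at most one of these two tails is non-empty
    merged.extend(run_b[j:])
    return merged


def merge_phase(file_1, file_2):
    """Repeat merge passes until a single run remains, then return that run."""
    while len(file_1) + len(file_2) > 1:
        file_3 = []
        for k in range(max(len(file_1), len(file_2))):
            run_a = file_1[k] if k < len(file_1) else []
            run_b = file_2[k] if k < len(file_2) else []
            file_3.append(merge_runs(run_a, run_b))
        # Redistribute the runs of file_3 alternately to file_1 and file_2.
        file_1, file_2 = file_3[0::2], file_3[1::2]
    remaining = file_1 + file_2
    return remaining[0] if remaining else []


sorted_run = merge_phase([[4, 9], [2, 8], [5]], [[1, 7], [3, 6]])
print(sorted_run)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With the five runs in this example, the loop performs three merge passes: five runs become three, then two, then one.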

Tips for
