Category Archives: Computers and Internet

Stages of Software Development

by Christopher Diggins

 

Summary
What specifically are the phases of software development? I was taught there were about 4 or 5, but I believe I have identified a few more.


I was taught (back in ’94 by my software engineering professor) that the stages of software development were something like the following (my memory is hazy, so I am probably not doing her full justice):

  • gather requirements
  • design
  • implementation
  • debugging
  • testing

I believe that it is important to consider a more fine-grained and less linear view of the stages of software development. I consider the following to be important interleaved phases for the development of most non-trivial commercial software:

  • scheduling – Self explanatory.
  • research – Learning more about the problems the software attempts to solve, and what the competing software does.
  • technology selection – Choosing what tools, languages, and technologies to use to build and develop the software.
  • reuse – Identifying code libraries and tools internally and externally that can be leveraged
  • prototyping – An important step which is often overlooked (often-times the first version is really a prototype).
  • code documentation
  • product documentation
  • refactoring – Changing the code to reflect changes in the implementation design.
  • extending – This refers to when more features are added during development, after prototyping, or after a release
  • revising – Related to refactoring, this refers to when the product requirements are significantly changed in some way
  • internationalization – It is usually the case that the software will be released in different locales with different languages and cultural conventions.
  • optimizing – It is rare that software doesn’t have some areas where better performance could significantly improve the product.
  • static analysis – Static analysis tools are an important part of detecting defects
  • reviewing code – Code reviews are an important supplement to testing
  • releasing – Getting the internal versions to various teams, and external versions to customers in a smooth and timely manner
  • recycling code – The code in a successful project will almost invariably be reused in some other project.
  • porting – Porting software to new operating systems and platforms is almost always inevitable in a successful product
  • support – Customer support is easily overlooked, but when taken into consideration will affect design decisions, and profitability.

By being aware of, and giving proper consideration to, these stages of software development I believe software projects increase their chances of success.

Terms: Superclass vs. Subclass

From "ATL Internals: Working with ATL 8, Second Edition".

SUPERCLASS

The Windows object model of declaring a window class and creating instances of that class is similar to that of the C++ object model. The WNDCLASSEX structure is to an HWND as a C++ class declaration is to a this pointer. Extending this analogy, Windows superclassing[3] is like C++ inheritance. Superclassing is a technique in which the WNDCLASSEX structure for an existing window class is duplicated and given its own name and its own WndProc. When a message is received for that window, it’s routed to the new WndProc. If that WndProc decides not to handle that message fully, instead of being routed to DefWindowProc, the message is routed to the original WndProc. If you think of the original WndProc as a virtual function, the superclassing window overrides the WndProc and decides on a message-by-message basis whether to let the base class handle the message.

[3] The theory of Windows superclassing is beyond the scope of this book. For a more in-depth discussion, see Win32 Programming (Addison-Wesley, 1997), by Brent Rector and Joe Newcomer.

The reason to use superclassing is the same reason to use inheritance of implementation: The base class has some functionality that the deriving class wants to extend. ATL supports superclassing via the DECLARE_WND_SUPERCLASS macro.

SUBCLASS

 

Previously in this chapter, I described superclassing as the Windows version of inheritance for window classes. Subclassing is a more modest and frequently used technique. Instead of creating a whole new window class, with subclassing, we merely hijack the messages of a single window. Subclassing is accomplished by creating a window of a certain class and replacing its WndProc with our own using SetWindowLong(GWL_WNDPROC). The replacement WndProc gets all the messages first and can decide whether to let the original WndProc handle it as well. If you think of superclassing as specialization of a class, subclassing is specialization of a single instance. Subclassing is usually performed on child windows, such as an edit box that the dialog wants to restrict to letters only. The dialog would subclass the child edit control during WM_INITDIALOG and handle WM_CHAR messages, throwing out any that weren’t suitable.

[11] For a more complete dissection of Windows subclassing, see Win32 Programming (Addison-Wesley, 1997), by Brent Rector and Joe Newcomer.

Terms: PascalCasing vs. camelCasing

The practice of marking all word boundaries in long identifiers (such as ThisIsASampleVariable), including the first letter of the identifier, with uppercase. Contrasts with camelCasing, in which the first character of the identifier is left in lowercase (thisIsASampleVariable), and with the traditional C style of short all-lower-case names with internal word breaks marked by an underscore (sample_var).

Where these terms are used, they usually go with advice to use PascalCasing for public interfaces and camelCasing for private ones. They may have originated at Microsoft, but are in more general use in ECMA standards, among Java programmers, and elsewhere.

Terms: Monkey-Patch vs. Duck Typing

(from answers.com)

Monkey-Patch

A Monkey-Patch (also called Monkey Patch, MonkeyPatch) is a way to extend or modify runtime code without altering the original source code for dynamic languages (e.g. Ruby and Python).

They are also referred to as:

  • Guerilla patch
  • Extending previously declared classes
  • Reopening classes
  • Hijacking
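
As a minimal sketch of the idea in Python: the class Greeter and the function shouting_greet below are hypothetical names invented for illustration, not part of any real library. The patch rebinds a method on an existing class at run time and leaves the class's source untouched.

```python
class Greeter:
    """Stands in for a class from a library whose source we do not want to edit."""

    def greet(self, name):
        return "Hello, " + name


def shouting_greet(self, name):
    # Replacement behaviour, defined outside the original class.
    return "HELLO, " + name.upper() + "!"


before = Greeter().greet("world")

# The monkey patch: rebind the method on the class object at run time.
original_greet = Greeter.greet      # keep a handle so the patch can be undone
Greeter.greet = shouting_greet

after = Greeter().greet("world")

# Undo the patch so later code sees the original behaviour again.
Greeter.greet = original_greet
restored = Greeter().greet("world")
```

Keeping a reference to the original method is a design choice rather than a requirement; it makes the patch reversible, which helps when two patches would otherwise interact.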

Etymology

The term Monkey-Patch was first used as Guerilla Patch, which referred to changing code sneakily at runtime without any rules. In some applications (such as Zope 2) these patches would sometimes interact counterintuitively, which was referred to as the patches engaging in battle with each other.

Because the words guerilla and gorilla sound so similar, people started using the incorrect term Gorilla Patch instead of Guerilla Patch. When a developer then created a Guerilla Patch, they tried very hard to avoid any battles that might ensue from the patch, and the term Monkey-Patch was coined to make the patch sound less forceful.

The term Monkey-Patch caught on and has been in use ever since.

Duck typing

In computer science, duck typing is a term for dynamic typing typical of some programming languages, such as Smalltalk or Visual FoxPro, where a variable’s value itself determines what the variable can do. It also implies that an object is interchangeable with any other object that implements the same interface, regardless of whether the objects have a related inheritance hierarchy.

The term is a reference to the "duck test"—"If it walks like a duck and quacks like a duck, it must be a duck." One can also say that the duck typing method ducks the issue of typing variables.

Dave Thomas is thought to have originated the term in the Ruby community.

Comparison with generics and structural subtyping

In C++ and some other languages, very flexible static binding capabilities, called generics or templates or operator overloading, provided the same advantages, but typically not as late as run time. This static polymorphism was distinguished from runtime facilities for dynamic types, although most theorists considered this distinction to be undesirable.

The Smalltalk architects sought to achieve true polymorphism with the Smalltalk protocol proposal for abstract data types: static interfaces that existed only to guarantee a particular interface. Dynamic mechanisms used in these languages (such as genericizing the "method not found" exception handler into a catch-all lookup mechanism, parallels to which came to be called duck typing in Java and Python) would converge and employ a single reasonable syntax.

C++ templates implement a static form of duck typing. An iterator, for example, does not inherit its methods from an Iterator base class.

Yet another approach similar to duck typing is OCaml’s structural subtyping, where object types are compatible if their method signatures are compatible, regardless of their declared inheritance. This is all detected at compile time through OCaml’s type inference system.

In Python

Duck typing is heavily used in Python. The Python Tutorial’s Glossary defines duck typing as follows:

Pythonic programming style that determines an object’s type by inspection of its method or attribute signature rather than by explicit relationship to some type object ("If it looks like a duck and quacks like a duck, it must be a duck.") By emphasizing interfaces rather than specific types, well-designed code improves its flexibility by allowing polymorphic substitution. Duck-typing avoids tests using type() or isinstance(). Instead, it typically employs hasattr() tests or EAFP (Easier to Ask Forgiveness than Permission) programming.

The standard example of duck typing in Python is file-like classes. Classes can implement some or all of the methods of file and can be used where file would normally be used. For example, GzipFile implements a file-like object for accessing gzip-compressed data. cStringIO allows treating a Python string as a file. Sockets and files share many of the same methods as well. However, sockets lack the tell() method and cannot be used everywhere that GzipFile can be used. This shows the flexibility of duck typing: a file-like object can implement only methods it is able to, and consequently it can be only used in situations where it makes sense.
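
The file-like example can be sketched as follows. This assumes Python 3, where io.StringIO plays the role that cStringIO played in Python 2; the class UpperCaseSource and both helper functions are hypothetical, written only to illustrate the point. The first helper accepts anything that has a read() method, and the second uses the EAFP style to cope with objects that lack tell().

```python
import io


class UpperCaseSource:
    """Hypothetical file-like class: it implements read() and nothing else."""

    def __init__(self, text):
        self._text = text

    def read(self):
        return self._text.upper()


def count_words(source):
    # No isinstance() check: anything with a read() method returning text works.
    return len(source.read().split())


def current_position(source):
    # EAFP: try tell() and handle the failure, rather than checking the type first.
    try:
        return source.tell()
    except AttributeError:
        return None


words_from_stringio = count_words(io.StringIO("walks like a duck"))
words_from_custom = count_words(UpperCaseSource("quacks like a duck"))
position_stringio = current_position(io.StringIO("abc"))
position_custom = current_position(UpperCaseSource("abc"))
```

Here UpperCaseSource is usable wherever only read() is needed, but not where tell() is required, which mirrors the socket case described above.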

In Java

In 2003, Dave Orme, leader of Eclipse’s Visual Editor Project was looking for a generic way to bind any SWT control to any JavaBeans-style object, sometimes incorrectly known as a POJO, short for Plain Old Java Object. He noticed that SWT reuses names religiously across its class hierarchy. For example, to set the caption of something, you normally set the Text property. This is true for an SWT Shell (window), a Text control, a Label, and many more SWT controls. Orme realized that if he could implement data binding in terms of the methods that are implemented on a control, he would save considerable work and achieve a much higher level of code reuse, compared to implementing separate data binding for an SWT Shell, Text, Label, and so on. When the Ruby community started describing this kind of type system as "duck typing", Orme realized that he had simply rediscovered what Smalltalk, Ruby, Python and other programmers had already known for a long time.

Orme formalized this knowledge by creating a class that makes duck typing simple and natural for Java programmers (see "Java Does Duck Typing"). Cedric Beust later cautioned about possible dangers using duck typing in "The Perils of Duck Typing".

In ColdFusion

ColdFusion, a web application scripting language, also allows duck typing although the technique is still fairly new to the ColdFusion developer community. Function arguments can be specified to be of type any so that arbitrary objects can be passed in and then method calls are bound dynamically at runtime. If an object does not implement a called method, a runtime exception is thrown which can be caught and handled gracefully. An alternative argument type of WEB-INF.cftags.component restricts the passed argument to be a ColdFusion Component (CFC), which provides better error messages should a non-object be passed in.

Terms: Value Types vs. Ref Types (Continued)

In .NET, whether a type is a value type or a reference type is, in essence, about where it’s allocated: native memory space or managed memory space.

In C++, it’s decided by how you declare a variable, while in C# it’s determined by the type itself.

C++:

int i; // native stack

int *i = new int(); // native heap 

int ^i = gcnew int(); // managed heap

C#:

int i = new int(); // native stack, initialized to binary 0

int i; // native stack, not initialized

Terms: Mutex vs. Critical Section

A critical section provides synchronization among the threads of a single process only, while a mutex allows data synchronization across processes.

critical section

In computer programming a critical section is a piece of code that accesses a shared resource (data structure or device) that must not be concurrently accessed by more than one thread of execution. A critical section will usually terminate in fixed time, and a thread, task or process will only have to wait a fixed time to enter it. Some synchronization mechanism is required at the entry and exit of the critical section to ensure exclusive use, for example a semaphore.

By carefully controlling which variables are modified inside and outside the critical section (usually, by accessing important state only from within), concurrent access to that state is prevented. A critical section is typically used when a multithreaded program must update multiple related variables without a separate thread making conflicting changes to that data. In a related situation, a critical section may be used to ensure a shared resource, for example a printer, can only be accessed by one process at a time.
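
A minimal sketch of this idea in Python, using threading.Lock as the synchronization mechanism. The class RunningAverage and its fields are hypothetical; the point is only that two related variables are touched exclusively inside the locked region, so no thread can observe one updated without the other.

```python
import threading


class RunningAverage:
    """Hypothetical shared state: total and count must always change together."""

    def __init__(self):
        self._lock = threading.Lock()
        self.total = 0
        self.count = 0

    def add(self, value):
        # Critical section: both related variables are updated under one lock.
        with self._lock:
            self.total += value
            self.count += 1

    def snapshot(self):
        # Readers take the same lock so they never see a half-finished update.
        with self._lock:
            return self.total, self.count


def worker(stats, repetitions):
    for _ in range(repetitions):
        stats.add(2)


stats = RunningAverage()
threads = [threading.Thread(target=worker, args=(stats, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total, count = stats.snapshot()
```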

How critical sections are implemented varies among operating systems.

The simplest method is to prevent any change of processor control inside the critical section. On uni-processor systems, this can be done by disabling interrupts on entry into the critical section, avoiding system calls that can cause a context switch while inside the section and restoring interrupts to their previous state on exit. Any thread of execution entering any critical section anywhere in the system will, with this implementation, prevent any other thread, including an interrupt, from getting the CPU and therefore from entering any other critical section or, indeed, any code whatsoever, until the original thread leaves its critical section.

This brute-force approach can be improved upon by using semaphores. To enter a critical section, a thread must obtain a semaphore, which it releases on leaving the section. Other threads are prevented from entering the critical section at the same time as the original thread, but are free to gain control of the CPU and execute other code, including other critical sections that are protected by different semaphores.

Some confusion exists in the literature about the relationship between different critical sections in the same program. In general, a resource that must be protected from concurrent access may be accessed by several pieces of code. Each piece must be guarded by a common semaphore. Is each piece now a critical section or are all the pieces guarded by the same semaphore in aggregate a single critical section? This confusion is evident in definitions of a critical section such as "… a piece of code that can only be executed by one process or thread at a time". This only works if all access to a protected resource is contained in one "piece of code", which requires either the definition of a piece of code or the code itself to be somewhat contrived.

Application Level Critical Sections

Application-level critical sections reside in the memory range of the process and are usually modifiable by the process itself. This is called a user-space object because the program run by the user (as opposed to the kernel) can modify and interact with the object. However the functions called may jump to kernel-space code to register the user-space object with the kernel.

Example Code For Critical Sections with Win32 API

/* Sample C/C++, Win9x/NT/ME/2000/XP, link to kernel32.dll */
#include <windows.h>

CRITICAL_SECTION cs; /* This is the critical section object -- once initialized, it cannot
                        be moved in memory */

int main(void)
{
    /* Initialize the critical section -- This must be done before locking */
    InitializeCriticalSection(&cs);

    /* Enter the critical section -- other threads are locked out */
    EnterCriticalSection(&cs);

    /* Do some thread-safe processing! */

    /* Leave the critical section -- other threads can now EnterCriticalSection() */
    LeaveCriticalSection(&cs);

    /* Release system object when all finished -- usually at the end of the cleanup code */
    DeleteCriticalSection(&cs);
    return 0;
}

Note that on Windows NT (not 9x/ME), you can use the function TryEnterCriticalSection() to attempt to enter the critical section. This function returns immediately so that the thread can do other things if it fails to enter the critical section (usually because another thread has locked it). Note that a CriticalSection is not the same as a Win32 Mutex, which is an object used for inter-process synchronization. A Win32 CriticalSection is for inter-thread synchronization (and has much faster lock times); however, it cannot be shared across processes.
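
The "try to enter, otherwise do something else" pattern is not specific to Win32. As a rough analogue, not a translation of the Win32 API, Python's threading.Lock offers acquire(blocking=False), which also returns immediately. The function try_do_work and the messages it records are hypothetical.

```python
import threading

lock = threading.Lock()


def try_do_work(results):
    # Non-blocking attempt, analogous in spirit to TryEnterCriticalSection():
    # acquire(blocking=False) returns immediately with True or False.
    if lock.acquire(blocking=False):
        try:
            results.append("did protected work")
        finally:
            lock.release()
    else:
        results.append("lock busy, did something else")


results = []
try_do_work(results)          # lock is free, so the attempt succeeds

lock.acquire()                # simulate another holder of the lock
helper = threading.Thread(target=try_do_work, args=(results,))
helper.start()
helper.join()                 # the helper's attempt fails while we hold the lock
lock.release()
```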

Kernel Level Critical Sections

Typically, critical sections prevent process and thread migration between processors and the preemption of processes and threads by interrupts and other processes and threads.

Critical sections often allow nesting. Nesting allows multiple critical sections to be entered and exited at little cost.

If the scheduler interrupts the current process or thread in a critical section, the scheduler will either allow the process or thread to run to completion of the critical section, or it will schedule the process or thread for another complete quantum. The scheduler will not migrate the process or thread to another processor, and it will not schedule another process or thread to run while the current process or thread is in a critical section.

Similarly, if an interrupt occurs in a critical section, the interrupt’s information is recorded for future processing, and execution is returned to the process or thread in the critical section. Once the critical section is exited, and in some cases the scheduled quantum completes, the pending interrupts will be executed.

Since critical sections may execute only on the processor on which they are entered, synchronization is only required within the executing processor. This allows critical sections to be entered and exited at almost zero cost. No interprocessor synchronization is required, only instruction stream synchronization. Most processors provide the required amount of synchronization by the simple act of interrupting the current execution state. This allows critical sections in most cases to be nothing more than a per processor count of critical sections entered.

Performance enhancements include executing pending interrupts at the exit of all critical sections and allowing the scheduler to run at the exit of all critical sections. Furthermore, pending interrupts may be transferred to other processors for execution.

Critical sections should not be used as a long-lived locking primitive. They should be short enough that the critical section will be entered, executed, and exited without any interrupts occurring, whether from hardware or from the scheduler.

 

semaphore (programming)

A semaphore is a protected variable (or abstract data type) and constitutes the classic method for restricting access to equivalent shared resources (e.g. storage) in a multiprogramming environment. They were invented by Edsger Dijkstra and first used in the THE operating system.

The value of the semaphore is initialized to the number of equivalent shared resources it is implemented to control. In the special case where there is a single equivalent shared resource, the semaphore is called a binary semaphore. The general case semaphore is often called a counting semaphore.

Semaphores are the classic solution to the dining philosophers problem, although they do not prevent all deadlocks.

Introduction

Semaphores can only be accessed using the following operations:

P(Semaphore s)
{
  await s > 0, then s := s-1; /* must be atomic once s > 0 is detected */
}

V(Semaphore s)
{
  s := s+1;   /* must be atomic */
}

Init(Semaphore s, Integer v)
{
  s := v;
}

Notice that incrementing the variable s must not be interrupted, and the P operation must not be interrupted after s is found to be nonzero. This can be done by special instruction (if the architecture’s instruction set supports it) or by ignoring interrupts in order to prevent other processes from becoming active.

The canonical names P and V come from the initials of Dutch words. V stands for verhoog, or "increase." Several explanations have been given for P (including passeer "pass," probeer "try," and pakken "grab"), but in fact Dijkstra wrote that he intended P to stand for the made-up portmanteau word prolaag,[1] short for probeer te verlagen, or "try-and-decrease."[2][3] (A less ambiguous English translation would be "try-to-decrease.") This confusion stems from the unfortunate characteristic of the Dutch language that the words for increase and decrease both begin with the letter V, and the words spelled out in full would be impossibly confusing for non–Dutch-speakers.

The value of a semaphore is the number of units of the resource which are free. (If there is only one resource, a "binary semaphore" with values 0 or 1 is used.) The P operation busy-waits (or maybe sleeps) until a resource is available, whereupon it immediately claims one. V is the inverse; it simply makes a resource available again after the process has finished using it. Init is only used to initialize the semaphore before any requests are made. The P and V operations must be atomic, which means that no process may ever be preempted in the middle of one of those operations to run another operation on the same semaphore.

In English textbooks, and in the programming language ALGOL 68, the P and V operations are sometimes called, respectively, down and up. In software engineering practice they are called wait and signal, or take and release, or pend and post.

To avoid busy-waiting, a semaphore may have an associated queue of processes (usually a FIFO). If a process performs a P operation on a semaphore which has the value zero, the process is added to the semaphore’s queue. When another process increments the semaphore by performing a V operation, and there are processes on the queue, one of them is removed from the queue and resumes execution.
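
A short sketch of a counting semaphore in Python, assuming the standard threading.Semaphore, whose acquire() and release() correspond to P and V and which blocks waiting threads instead of busy-waiting. The resource count, the usage dictionary and the function use_resource are illustrative assumptions.

```python
import threading
import time

RESOURCES = 2
pool = threading.Semaphore(RESOURCES)   # Init(s, 2): two equivalent resources
state_lock = threading.Lock()
usage = {"current": 0, "peak": 0}       # bookkeeping, only to observe the effect


def use_resource():
    pool.acquire()                      # P(s): wait until a unit is free, then claim it
    try:
        with state_lock:
            usage["current"] += 1
            usage["peak"] = max(usage["peak"], usage["current"])
        time.sleep(0.01)                # pretend to use the resource
        with state_lock:
            usage["current"] -= 1
    finally:
        pool.release()                  # V(s): make the unit available again


workers = [threading.Thread(target=use_resource) for _ in range(6)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

However the six threads are scheduled, the recorded peak never exceeds the number of resources the semaphore was initialized with.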

Semaphores today as used by programmers

Semaphores remain in common use in programming languages that do not intrinsically support other forms of synchronization. They are the primitive synchronization mechanism in many operating systems. The trend in programming language development, though, is towards more structured forms of synchronization, such as monitors and channels. In addition to their inadequacies in dealing with deadlocks, semaphores do not protect the programmer from the easy mistakes of taking a semaphore that is already held by the same process, and forgetting to release a semaphore that has been taken.

Example usage

Since semaphores can have a count associated with them, they are usually made use of when multiple threads cooperatively need to achieve an objective. Consider this example:

We have a thread A that needs information from two databases, before it can proceed. Access to these databases is controlled by two separate threads B, C. These two threads have a message-processing loop; anybody needing their use posts a message into their message queue. Thread A initializes a semaphore S with init(S,-1). A then posts a data request, including a pointer to the semaphore S, to both B and C. Then A calls P(S), which blocks. The other two threads meanwhile take their time obtaining the information; when each thread finishes obtaining the information, it calls V(S) on the passed semaphore. Only after both threads have completed will the semaphore’s value be positive and A be able to continue. A semaphore used in this way is called a "counting semaphore."
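
The scenario above can be sketched in Python. The standard threading.Semaphore rejects negative initial values, so this sketch defines its own small Semaphore class that follows the P, V and Init definitions given earlier; that class, the function database_thread and the stand-in "data" strings are all assumptions made for illustration.

```python
import threading


class Semaphore:
    """Textbook semaphore following the P/V/Init definitions above.

    Unlike threading.Semaphore, this sketch allows a negative initial value.
    """

    def __init__(self, value):
        self._value = value                 # Init(s, v)
        self._cond = threading.Condition()

    def P(self):
        with self._cond:
            while self._value <= 0:         # await s > 0
                self._cond.wait()
            self._value -= 1                # then s := s - 1

    def V(self):
        with self._cond:
            self._value += 1                # s := s + 1
            self._cond.notify()


def database_thread(name, semaphore, answers):
    answers[name] = "data from " + name     # stand-in for the real lookup
    semaphore.V()                           # signal that this request is done


S = Semaphore(-1)
answers = {}
b = threading.Thread(target=database_thread, args=("B", S, answers))
c = threading.Thread(target=database_thread, args=("C", S, answers))
b.start()
c.start()

S.P()   # blocks until both B and C have called V(S), i.e. until S reaches 1
collected = sorted(answers)
b.join()
c.join()
```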

Apart from a counting semaphore we also have a "blocking semaphore." A blocking semaphore is a semaphore that is initialized to zero. This has the effect that any thread that does a P(S) will block until another thread does a V(S). This kind of construct is very useful when the order of execution among threads needs to be controlled.

The simplest kind of semaphore is the "binary semaphore," used to control access to a single resource. It is essentially the same as a mutex. It is always initialized with the value 1. When the resource is in use, the accessing thread calls P(S) to decrease this value to 0, and restores it to 1 with the V operation when the resource is ready to be freed.

mutual exclusion

Mutual exclusion (often abbreviated to mutex) algorithms are used in concurrent programming to avoid the simultaneous use of un-shareable resources by pieces of computer code called critical sections.

Examples of such resources are fine-grained flags, counters or queues, used to communicate between code that runs concurrently, such as an application and its interrupt handlers. The problem is acute because a thread can be stopped or started at any time.

To illustrate: suppose a section of code is mutating a piece of data over several program steps, when another thread, perhaps triggered by some unpredictable event, starts executing. If this second thread reads from the same piece of data, the data, in the process of being overwritten, is in an inconsistent and unpredictable state. If the second thread tries overwriting that data, the ensuing state will probably be unrecoverable. These critical sections of code accessing shared data must therefore be protected, so that other processes which read from or write to the chunk of data are excluded from running.

A mutex is also a common name for a program object that negotiates mutual exclusion among threads, also called a lock.

Introduction

On a uniprocessor system the common way to achieve mutual exclusion is to disable interrupts for the smallest possible number of instructions that will prevent corruption of the shared data structure, the so-called "critical region". This prevents interrupt code from running in the critical region. Besides this hardware-supported solution, some software solutions exist that use "busy-wait" to achieve the goal. Examples of these algorithms include Dekker’s algorithm, Peterson’s algorithm, and Lamport’s bakery algorithm.

In a computer in which several processors share memory, an indivisible test-and-set of a flag is used in a tight loop to wait until the other processor clears the flag. The test-and-set performs both operations without releasing the memory bus to another processor. When the code leaves the critical region, it clears the flag. This is called a "spinlock" or "busy-wait."

Some computers have similar indivisible multiple-operation instructions for manipulating the linked lists used for event queues and other data structures commonly used in operating systems.

Most classical mutual exclusion methods attempt to reduce latency and busy-waits by using queuing and context switches. Some claim that benchmarks indicate that these special algorithms waste more time than they save.

Many forms of mutual exclusion have side-effects. For example, classic semaphores permit deadlocks, in which one process gets a semaphore, another process gets a second semaphore, and then both wait forever for the other semaphore to be released. Other common side-effects include starvation, in which a process never gets sufficient resources to run to completion, priority inversion in which a higher priority thread waits for a lower-priority thread, and "high latency" in which response to interrupts is not prompt.
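
One common way to avoid the deadlock described above is to make every thread acquire its locks in the same global order. The text does not prescribe this technique; the following is only a sketch of it, using Python locks rather than semaphores, with hypothetical helper names and id() as an arbitrary but consistent ordering key.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
completed = []


def with_both(first, second, label):
    # Acquire in one global order (here: by id()), whatever order the caller used.
    low, high = sorted((first, second), key=id)
    with low:
        with high:
            completed.append(label)


def run(first, second, label, repetitions):
    for _ in range(repetitions):
        with_both(first, second, label)


# The two threads name the locks in opposite orders, the classic deadlock setup.
t1 = threading.Thread(target=run, args=(lock_a, lock_b, "t1", 500))
t2 = threading.Thread(target=run, args=(lock_b, lock_a, "t2", 500))
t1.start()
t2.start()
t1.join()
t2.join()
```

Because both threads end up taking the locks in the same order, neither can hold one lock while waiting for the other to release the second.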

Much research is aimed at eliminating the above effects, such as by guaranteeing non-blocking progress. No perfect scheme is known.

Terms: Value Types vs. Ref Types

While searching for an answer about how to tell whether a type is a reference type or a value type, I came across this blog post, "Value Types, Reference Types, and writing with clarity!", which is quite concise and practical, though it does not answer my question.

All .NET Framework data types are either value types or reference types.
Value Types
Memory for a value type is allocated on the current thread’s stack. A value type’s data is maintained completely within this memory allocation. The memory for a value type is maintained only for the lifetime of the stack frame in which it is created. The data in value types can outlive their stack frames when a copy is created by passing the data as a method parameter or by assigning the value type to a reference type. Value types are passed by value by default. "By Value" is when an argument is passed into a function by passing a copy of the value. In this case, changing the copy doesn’t affect the original value.

If a value type is passed to a parameter of reference type, a wrapper object is created (the value type is boxed), and the value type’s data is copied into the wrapper object. For example, passing an integer to a method that expects an object results in a wrapper object being created.
Reference Types
The data for reference type objects is always stored on the managed heap. Variables that are reference types consist of only the pointer to that data. The memory for reference types such as classes, delegates, and exceptions is reclaimed by the garbage collector when they are no longer referenced. It is important to know that reference types are always passed by reference. "By Reference" is when an argument is passed to a function by passing a reference to the actual value. In this case, if you change the argument in the function, you also change the original.

If you specify that a reference type should be passed by value, a copy of the reference is made and the reference to the copy is passed.

 

The answer to my question, quoted from Applied Microsoft .NET Framework Programming, is to look the type up in MSDN:

The .NET Framework Reference documentation clearly indicates which types are reference types and which are value types. When looking up a type in the documentation, any type called a class is a reference type. For example, the System.Object class, the System.Exception class, the System.IO.FileStream class, and the System.Random class are all reference types. On the other hand, the documentation refers to each value type as a structure or an enumeration. For example, the System.Int32 structure, the System.Boolean structure, the System.Decimal structure, the System.TimeSpan structure, the System.DayOfWeek enumeration, the System.IO.FileAttributes enumeration, and the System.Drawing.FontStyle enumeration are all value types.
If you look more closely at the documentation, you’ll notice that all the structures are immediately derived from the System.ValueType type. System.ValueType is itself immediately derived from the System.Object type. By definition, all value types must be derived from ValueType.
Note
All enumerations are derived from System.Enum, which is itself derived from System.ValueType. The CLR and all programming languages give enumerations special treatment. For more information about enumerated types, refer to Chapter 13.
Even though you can’t choose a base type when defining your own value type, a value type can implement one or more interfaces if you choose. In addition, the CLR doesn’t allow a value type to be used as a base type for any other reference type or value type. So, for example, it’s not possible to define any new types using Boolean, Char, Int32, UInt64, Single, Double, Decimal, and so on as base types.
Important
For many developers (such as unmanaged C/C++ developers), reference types and value types will seem strange at first. In unmanaged C/C++, you declare a type and then the code that uses the type gets to decide if an instance of the type should be allocated on the thread’s stack or in the application’s heap. In managed code, the developer defining the type indicates where instances of the type are allocated; the developer using the type has no control over this.

Terms: glob

American Heritage Dictionary:

glob (glŏb)
n.

  1. A small drop; a globule.
  2. A soft thick lump or mass: a glob of mashed potatoes; globs of red mud.

[Middle English globbe, large mass, from Latin globus, globular mass.]

Wikipedia:

glob() is a Unix library function that expands file names using a pattern matching notation reminiscent of regular expression syntax but without the expressive power of true regular expressions. The word "glob" is also used as a noun when discussing a particular pattern, e.g. "use the glob *.log to match all those log files".
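
As a small sketch of file name expansion, here is the "*.log" case using Python's standard glob module rather than the Unix C function itself. The file names are made up, and the files are created in a temporary directory so the example does not depend on what is on disk.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create a few empty files to match against (names are illustrative).
    for name in ("app.log", "db.log", "notes.txt"):
        open(os.path.join(folder, name), "w").close()

    # Expand the glob *.log into the list of matching paths.
    matches = glob.glob(os.path.join(folder, "*.log"))
    log_names = sorted(os.path.basename(path) for path in matches)
```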

The term glob is now used to refer more generally to limited pattern matching facilities of this kind in other contexts. Larry Wall’s Programming Perl discusses glob in the context of the Perl language. Similarly, Tcl contains both true regular expression matching facilities and a more limited kind of pattern matching often described as globbing.

Glob is also the name of an Italian television comedy produced by Enrico Bertolino which addresses the language of communication used by the mass media and other such topics. Brilliant and amusing, it offers numerous observations on the poor communication of television journalists.

Hacker Slang:

[Unix; common] To expand special characters in a wildcarded name, or the act of so doing (the action is also called globbing). The Unix conventions for filename wildcarding have become sufficiently pervasive that many hackers use some of them in written English, especially in email or news on technical topics. Those commonly encountered include the following:

*
wildcard for any string (see also UN*X)

?
wildcard for any single character (generally read this way only at the beginning or in the middle of a word)

[]
delimits a wildcard matching any of the enclosed characters

{}
alternation of comma-separated alternatives; thus, ‘foo{baz,qux}’ would be read as ‘foobaz’ or ‘fooqux’

Some examples: “He said his name was [KC]arl” (expresses ambiguity). “I don’t read talk.politics.*” (any of the talk.politics subgroups on Usenet). Other examples are given under the entry for X. Note that glob patterns are similar, but not identical, to those used in regexps.
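
The wildcards listed above can be tried out with Python's standard fnmatch module, which implements shell-style matching on plain strings. This is a sketch under that assumption; note that fnmatch has no {} alternation, so braces are treated as literal characters there. The last two lines illustrate how the same pattern text means something different as a regexp.

```python
import fnmatch
import re

# '*' matches any string, '?' any single character, '[...]' any listed character.
star = fnmatch.fnmatchcase("talk.politics.misc", "talk.politics.*")
question = fnmatch.fnmatchcase("cat", "c?t")
bracket = [name for name in ("Karl", "Carl", "Earl")
           if fnmatch.fnmatchcase(name, "[KC]arl")]

# fnmatch has no '{a,b}' alternation; braces are matched literally.
braces = fnmatch.fnmatchcase("foobaz", "foo{baz,qux}")

# The same characters mean something different in a regular expression:
# in a regexp, '*' repeats the preceding item, so 'c*t' also matches 't'.
glob_matches_t = fnmatch.fnmatchcase("t", "c*t")
regex_matches_t = re.fullmatch("c*t", "t") is not None
```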

Historical note: The jargon usage derives from glob, the name of a subprogram that expanded wildcards in archaic pre-Bourne versions of the Unix shell.