Connect commands into a pipeline
In this module, you'll learn how to connect commands into a pipeline. You'll also learn about filtering left, formatting right, and other important principles.
Learning objectives
After you complete this module, you'll be able to:
- Explore cmdlets further and construct a sequence of them in a pipeline.
- Apply sound principles to your commands by using filtering and formatting.
Prerequisites
- Basic familiarity with using command-line shells, such as a Command Prompt window or Git Bash
- Visual Studio Code, installed
- Familiarity with installing Visual Studio Code extensions
- Familiarity with installing software on Windows or other operating systems
- Familiarity with running commands in PowerShell
Introduction
In PowerShell, you run compiled commands, or cmdlets. By connecting these cmdlets, you can create powerful combined statements, or pipelines. You'll find such combined statements useful as you're looking to automate your workflows.
As part of creating these pipelines, it's beneficial to understand how to format the output to your liking. For example, a carefully selected output format might allow you to quickly get an overview of a situation, or make it easier to fit the output into a report.
When you learn to combine cmdlets, you unlock much of the power that's built into PowerShell.
Selecting data
Running a command can be powerful: you get data from your local machine or from across the network. To be even more effective, you need to learn how to get exactly the data that you want. Most commands operate on objects, as input, as output, or both. Objects have properties, and you may want to access a subset of those properties and present them in a report. You might also want to sort the data based on one or more properties. But how do you get there?
Use Get-Member to inspect output
When you pass the results of a command to Get-Member, Get-Member returns information about the object, like:
- The type of object being passed to Get-Member.
- The Properties of the object that may be evaluated.
- The Methods of the object that may be executed.
Let's demonstrate this fact by running Get-Member on the command Get-Process.
>Get-Process | Get-Member
Note how you are using the pipe | and that by calling Get-Member, you are in fact creating a pipeline already. The first few lines of output from the preceding statement look like so:
TypeName: System.Diagnostics.Process
Name MemberType Definition
---- ---------- ----------
Handles AliasProperty Handles = Handlecount
Name AliasProperty Name = ProcessName
NPM AliasProperty NPM = NonpagedSystemMemorySize64
PM AliasProperty PM = PagedMemorySize64
SI AliasProperty SI = SessionId
VM AliasProperty VM = VirtualMemorySize64
WS AliasProperty WS = WorkingSet64
...
The output shows the type of object that the Get-Process command returns (System.Diagnostics.Process). The rest of the response shows the name, type, and definition of the object's members. If you want to fit Get-Process with another command in a pipeline, pairing it with Get-Member is a good first step.
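Get-Member can return a long list, so it can help to narrow it down. The following is a minimal sketch, assuming you only care about property-like members. It uses the automatic variable $PID (the ID of the current PowerShell process) purely so that the example runs on any machine; that choice isn't part of the original walkthrough.

```powershell
# Inspect only the property-like members of the current session's process object.
# $PID is an automatic variable that holds the ID of the running PowerShell process.
$members = Get-Process -Id $PID | Get-Member -MemberType Properties

# Show just the member names and the kind of member each one is.
$members | Select-Object -Property Name, MemberType
```

The value Properties covers alias, script, and regular properties alike, which is why members such as CPU and Name appear next to Id.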
Select-Object
By default, when you run a command that writes output to the screen, PowerShell automatically adds the command Out-Default. If the data consists of objects rather than just a collection of strings, PowerShell looks at the object type to determine whether there's a registered view for that type, and if so, it uses that view.
The view generally doesn't contain all the properties of an object because it wouldn't display properly on screen, so only some of the most common properties are included in the view.
You can override the default view by using Select-Object and choosing your own list of properties. You can then send those properties to Format-Table or Format-List, to display the data however you like.
Consider the result of running Get-Process on the process zsh:
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 0.01 0.38 644 620 zsh
0 0.00 0.01 0.38 727 727 zsh
0 0.00 0.01 0.38 731 731 zsh
0 0.00 0.01 0.38 743 743 zsh
0 0.00 0.01 0.38 750 750 zsh
0 0.00 0.88 0.91 15747 …47 zsh
0 0.00 0.01 0.29 41983 …83 zsh
0 0.00 1.16 0.31 68298 …98 zsh
What you see is a view that represents what you most likely want to see from this command. However, this view doesn't show you a complete set of information. In order to see something different, you can explicitly specify which properties you want to see in the result.
Getting the full response
What you've seen so far is a limited response. To present the full response, you use a wildcard *, like so:
>Get-Process zsh | Format-List -Property *
The * character shows you every property and its value, which allows you to investigate the values you're interested in. Unlike the default view, whose column headings are presentation names such as CPU(s), the full response lists properties by their actual names.
Despite these benefits, you may not want a full output of data, but you may not be content with the default response either.
Selecting specific columns
To find a middle ground between the default response and the full response, select the properties you're interested in and pass them as parameter input to Select-Object. There's one catch: you need to use the real names for the columns. How do you find out the real names? Use Get-Member. A call to Get-Member gives you all the properties and their actual names.
Finding the real property name
Let's quickly recap on the default response, with this subset:
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 0.01 0.38 644 620 zsh
From the default response, the properties Id and ProcessName are most likely called the same, but CPU(s) is a presentation name. Real property names tend to consist of only text characters and no spaces. To find out the real name for a specific property, you can use Get-Member:
>Get-Process zsh | Get-Member -Name C*
You now get a list of all members with names that start with a C. Among them is CPU, which is likely what you want:
TypeName: System.Diagnostics.Process
Name MemberType Definition
---- ---------- ----------
CancelErrorRead Method void CancelErrorRead()
CancelOutputRead Method void CancelOutputRead()
Close Method void Close()
CloseMainWindow Method bool CloseMainWindow()
Container Property System.ComponentModel.IContainer Container {get;}
CommandLine ScriptProperty System.Object CommandLine {get=…
Company ScriptProperty System.Object Company {get=$this.Mainmodule.FileVersionInfo.CompanyName;}
CPU ScriptProperty System.Object CPU {get=$this.TotalProcessorTime.TotalSeconds;}
You now know how to use Select-Object to ask for exactly what you need with the correct property names, like so:
Get-Process zsh | Select-Object -Property Id, Name, CPU
And here it is:
Id Name CPU
-- ---- ---
644 zsh 0.3812141
727 zsh 0.3826498
731 zsh 0.3784953
743 zsh 0.3776352
750 zsh 0.3824036
15747 zsh 0.9097993
41983 zsh 0.2934763
68298 zsh 0.3121695
This sequence of commands gives you an output that differs from the default output but contains properties that you care about.
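To check that Select-Object really trims an object down to the chosen properties, you can list the property names of the result. This is a small sketch that assumes the current PowerShell process ($PID) as a stand-in for zsh, so that it runs on any machine.

```powershell
# Keep only three properties of the current session's process.
$selected = Get-Process -Id $PID | Select-Object -Property Id, Name, CPU

# List the property names that survive on the new object.
$selected.PSObject.Properties.Name
```

The result carries only Id, Name, and CPU, so any later command in the pipeline can rely on those names and nothing else.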
Sorting
When using Sort-Object in a pipeline, PowerShell sorts the output data by using the default properties first. If no such properties exist, it then tries to compare the objects themselves. The sort order is either ascending or descending.
By providing properties, you can choose to sort by specific columns, like so:
>Get-Process | Sort-Object -Descending -Property Name
In the preceding command, we're sorting by the column Name in descending order. To sort by more than one column, separate the column names with a comma, like so:
>Get-Process | Sort-Object -Descending -Property Name, CPU
In addition to sorting by column name, you can also provide your own custom expression. In this example, we use a custom expression to sort by the columns Name and CPU and control the sort order for each column.
>Get-Process 'some process' | Sort-Object -Property @{Expression = "Name"; Descending = $True}, @{Expression = "CPU"; Descending = $False}
The preceding example demonstrates how powerful and flexible Sort-Object can be. This topic is a bit advanced and out of scope for this module, but will be revisited in more advanced modules.
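To see the per-column sort order in action without depending on which processes happen to be running, here is a minimal sketch that sorts a handful of in-memory objects. The sample names and CPU values are hypothetical and only stand in for real process data.

```powershell
# Hypothetical sample data standing in for process objects.
$sample = @(
    [pscustomobject]@{ Name = 'alpha'; CPU = 3.0 }
    [pscustomobject]@{ Name = 'beta';  CPU = 1.0 }
    [pscustomobject]@{ Name = 'beta';  CPU = 2.0 }
    [pscustomobject]@{ Name = 'alpha'; CPU = 0.5 }
)

# Sort by Name in descending order, then by CPU in ascending order within each name.
$sorted = $sample | Sort-Object -Property @{ Expression = 'Name'; Descending = $true },
                                          @{ Expression = 'CPU';  Descending = $false }

# Format only at the very end, for display.
$sorted | Format-Table -Property Name, CPU
```

Both beta rows come first because Name is sorted in descending order, and within each name the lower CPU value is listed first.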
Exercise - Construct a pipeline
Discover the most-used processes on your machine
To manage your machine, you sometimes need to discover what processes run on it and how much memory and CPU they consume. This information tells you what the machine spends its resources on. You can use this information to decide whether to introduce new processes on your machine, to leave the machine as it is, or to free resources by closing resource-intensive processes. The more you know about the processes that run on your machine, the better.
- Type pwsh in a terminal window to start a PowerShell session:
>pwsh
- To begin, run the command Get-Process, and pipe in the cmdlets Where-Object, Sort-Object, and Select-Object.
>Get-Process | Where-Object CPU -gt 2 | Sort-Object CPU -Descending | Select-Object -First 3
The exact output depends on your machine, but you should see the three processes with the highest CPU values above 2, sorted in descending order with the greatest CPU value at the top of the list. Your output will look similar to the following example:
NPM(K) PM(M) WS(M) CPU(s) Id SI ProcessName
------ ----- ----- ------ -- -- -----------
0 0.00 100.00 120,000.00 4000 1 some-process-name
0 0.00 100.00 30,000.66 400 1 some-other-process-name
0 0.00 100.00 27,000.00 500 1 a-process
Use formatting and filtering
When you work with PowerShell, filtering and formatting are important concepts to understand, for a couple of reasons. First, you want to create a pipeline that produces the result you want. Second, you want to do so efficiently, in terms of how you pull data over the network and how you ensure that the result is something you can work with.
Filtering left
In a pipeline statement, filtering left means filtering for the results you want as early as possible. You can think of the term left as early, because PowerShell statements run from left to right. The idea is to make the statement fast and efficient by ensuring that the dataset you operate on is as small as possible. This principle really comes into play when your commands are backed by larger data stores or you're bringing back results across the network.
Consider the following statement:
Get-Process | Select-Object Name | Where-Object Name -eq 'name-of-process'
This statement first retrieves all of the processes on the machine. It ends up formatting the response so that only the Name property is listed. This statement doesn't follow the filtering left principle, because it operates on all the processes, attempts to format the response, and then filters at the end.
It's better to filter first and then format, as in the following statement.
Get-Process | Where-Object Name -eq name-of-process | Select-Object Name
Often, a cmdlet that offers filtering is more efficient than using Where-Object. Here's a more efficient version of the preceding statement:
>Get-Process -Name name-of-process | Select-Object Name
In this version, the parameter -Name does the filtering for you.
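The two approaches can be compared side by side. This is a sketch under one assumption: it filters on the ID of the current PowerShell process ($PID) instead of a process name, so that it returns exactly one result on any machine.

```powershell
# Late filtering: every process is retrieved and reshaped before one is kept.
$late = Get-Process | Select-Object -Property Name, Id | Where-Object Id -eq $PID

# Early filtering: the cmdlet's own -Id parameter returns only the process asked for.
$early = Get-Process -Id $PID | Select-Object -Property Name, Id

# Both approaches end with the same single result.
$late.Id -eq $early.Id
```

The results match, but the second statement never creates objects for processes it's going to discard, which matters more as the data source grows.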
Formatting right, formatting as the last thing you do
Whereas filtering left means filtering as early as possible in a statement, formatting right means formatting as late as possible in the statement. Why format late? Because format commands alter the resulting object so that your data is no longer found in the same properties. This alteration affects your ability to retrieve the information you want by piping to commands such as Select-Object, or by looping through the objects with foreach.
The formatting destroys the object with which you're dealing. Take the following call for example:
>Get-Process 'some process' | Select-Object Name, CPU | Get-Member
The type you get back is Selected.System.Diagnostics.Process, which is still based on the process type and still exposes the Name and CPU properties. Now, add the Format-Table formatter like so:
>Get-Process 'some process' | Format-Table Name,CPU | Get-Member
If you only focus on the types you get back, you'll notice you're getting back something different:
TypeName: Microsoft.PowerShell.Commands.Internal.Format.FormatStartData
TypeName: Microsoft.PowerShell.Commands.Internal.Format.GroupStartData
TypeName: Microsoft.PowerShell.Commands.Internal.Format.FormatEntryData
TypeName: Microsoft.PowerShell.Commands.Internal.Format.GroupEndData
What these types are isn't important for this lesson. What is important is to realize that when you use any type of formatting command, your data is different, and when it's different it might no longer contain the columns you care about. Let's illustrate this with an example:
>Get-Process 'some process' | Select-Object Name, CPU
The preceding command gives you a result with the columns Name and CPU.
Name CPU
---- ---
zsh 1.2984395
zsh 0.2522047
zsh 0.2486375
zsh 0.2683466
zsh 0.2681874
zsh 1.6799438
zsh 0.2909816
zsh 0.7855272
Let's use formatting first and then Select-Object, to illustrate what might happen if you don't format last:
>Get-Process 'some process' | Format-Table Name,CPU | Select-Object Name, CPU
The result now looks like so:
Name CPU
---- ---
It's empty, because Format-Table transformed your object by placing data into other properties. Your data isn't gone, but the properties you asked for are. Select-Object looks for the Name and CPU properties on the formatted objects but is unable to find them.
Formatting commands should be the last thing you use in your statement because they're meant for formatting things nicely for screen presentation. They aren't meant to be used as a way to filter or sort your data.
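The difference between formatting early and formatting last can be reproduced with a single in-memory object. This is a minimal sketch; the object and its values are hypothetical and stand in for a real process.

```powershell
# Hypothetical sample object standing in for a process.
$item = [pscustomobject]@{ Name = 'demo'; CPU = 1.5 }

# Formatting too early: the pipeline now carries formatting records, not the original object.
$formatted = $item | Format-Table -Property Name, CPU
$formatted | ForEach-Object { $_.GetType().Name }

# Formatting last: work with the real object first, then format only for display.
$kept = $item | Select-Object -Property Name, CPU
$kept | Format-Table
```

After the early Format-Table call, nothing in $formatted has a Name property to select, whereas $kept still exposes Name and CPU right up to the final formatting step.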
Formatting commands
The most common cmdlets to use to format your output are Format-Table and Format-List. By default, most cmdlets format output as a table. If you don't want your output to display properties in columns, use the Format-List cmdlet to reformat them as a list.
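Both cmdlets accept a -Property parameter, so the same properties can be shown either way. The sketch below assumes the current PowerShell process ($PID) as sample input. Out-String is used here only to capture the rendered text in variables.

```powershell
# Use the current session's process as sample input.
$process = Get-Process -Id $PID

# Render the same properties as a table and as a list, and capture the text.
$asTable = $process | Format-Table -Property Name, Id | Out-String
$asList  = $process | Format-List -Property Name, Id | Out-String

# Display both renderings.
$asTable
$asList
```

The table puts each property in a column, while the list prints one "Name : value" pair per line, which tends to read better when values are long.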
Exercise - Format your output
Compare formatting approaches
Different output formats make sense for different scenarios. For example, depending on the type of data that you want to display, a table might make more sense than a list.
Some cmdlets use a certain type of formatting by default. You can override the default formatting by using a formatting cmdlet.
- Type pwsh in a terminal window to start a PowerShell session:
>pwsh
- In your PowerShell shell, run the Get-Member command:
>"a string" | Get-Member
The output is a table that lists all the members. Here are the first few lines of the output:
Name MemberType Definition
---- ---------- ----------
Clone Method System.Object Clone(), System.Object ICloneable.Clone()
CompareTo Method int CompareTo(System.Object value), int CompareTo(string strB), int IComparable.CompareTo(…
Next, you override the default formatting by using the Format-List cmdlet.
- Run the Format-List command, as shown here:
>"a string" | Get-Member | Format-List
The resulting output is different from the preceding output. The first few lines now appear as a list, as shown here:
TypeName : System.String
Name : Clone
MemberType : Method
Definition : System.Object Clone(), System.Object ICloneable.Clone()
TypeName : System.String
Name : CompareTo
MemberType : Method
Definition : int CompareTo(System.Object value), int CompareTo(string strB), int IComparable.CompareTo(System.Object obj), int IComparable[string].CompareTo(string other)
Summary
In this module, you started by learning what a pipeline is.
A pipeline is a set of commands connected by the pipe (|) character. The idea is to have the output of one command serve as the input to the next command.
As part of constructing the pipeline, you learned that you first need to evaluate whether a command fits, and can be added as the next command in the pipeline. A command fits if its output matches the input that's needed for the next command to run.
The Get-Help command helps you inspect the command, and the INPUT and PARAMETERS sections can help you understand what types of input a command accepts. For pipeline input, you need to find parameters that have the property Accept pipeline input? set to True.
There's also an evaluation order that determines how pipeline input is matched to a command's parameters. Input can be considered valid in either of two main ways. The first way, By value, means that the type of the incoming object must match the type that a parameter expects. The second way, By property name, means that whatever type of object is passed in, it must have a property whose name matches the parameter.
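Binding by property name can be tried out with a custom object. The sketch below assumes the current PowerShell process as the lookup target: it builds an object whose only property is Name and pipes it to Get-Process, whose -Name parameter accepts pipeline input by property name.

```powershell
# Build an object whose only property is Name, holding the current session's process name.
$currentName = (Get-Process -Id $PID).ProcessName
$byName = [pscustomobject]@{ Name = $currentName }

# Get-Process binds the Name property of the incoming object to its -Name parameter.
$found = $byName | Get-Process

# Inspect how the -Name parameter accepts pipeline input.
Get-Help Get-Process -Parameter Name
```

The incoming object isn't a process at all, yet the pipeline works because the property name lines up with a parameter that accepts input that way.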
Finally, you learned about filtering and formatting. The filtering left concept is important because it dictates that you should filter as close to the data source as possible. That is, it should be input as far left, or early, in the statement as possible. This placement is especially important when you work on large data stores and you need data to be returned over the network. Formatting right means that any output formatting should be placed as far to the right, or late, in the statement as possible.
Additional resources
- About pipelines
- What is a pipeline?
- Use filtering left
- Use formatting right
- Use Where-Object commands
- Use Format commands