structured-text-tools

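License: Readme
Language: JavaScript
Category: Application tools, Terminal/remote login
Software type: Open source
Region: Unknown
Submitted by: 谷梁嘉运
Operating system: Cross-platform
Open source organization:
Intended audience: Unknown
Software overview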

Structured text tools

The following is a list of text-based file formats and command line tools for manipulating each.

Awk-like

Tools that work with lines of fields separated by delimiters but do not necessarily support CSV field quoting.
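
To illustrate why field quoting matters, here is a minimal Python sketch (not taken from any tool in this list) that compares a naive split on the delimiter with a CSV-aware parse. The sample line is made up for the example.

```python
import csv
import io

# A made-up record whose second field contains the delimiter inside quotes.
line = 'widget,"Smith, Jane",42\n'

# Awk-like approach: split on every comma and ignore quoting.
naive_fields = line.rstrip("\n").split(",")

# CSV-aware approach: the csv module honors field quoting.
quoted_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields)    # the quoted name is split in two
print(len(quoted_fields), quoted_fields)  # the quoted name stays intact
```

The naive split sees four fields where the CSV parser sees three, which is the kind of mismatch that tools in the CSV section are meant to avoid.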

Awk

Awk is a POSIX-standard command line tool and programming language. If you use Linux, macOS, or a BSD, you almost certainly have it installed. See below for Windows.

  • If you already know how to program, the nawk man page is a great way to learn Awk quickly. What you learn from it will apply to other implementations on different platforms. Read it first if you feel overwhelmed by the sheer size of the GNU Awk manual.
  • Awk.info archive — an extensive resource on Awk.
  • AWK Vs NAWK Vs GAWK — a comparison of features present in different implementations.
  • busybox-w32 includes a full implementation of POSIX Awk and other tools like sed in a single Windows executable.
  • GNU Awk 5 binaries for Windows by EZWinPorts.

POSIX commands

Name Description
comm Select the lines common to two sorted files or the lines contained in only one of them. (Manual: man 1 comm on your system, GNU, FreeBSD.)
cut Select portions of each line in one or more files. (Manual: man 1 cut, GNU, FreeBSD.)
grep Select the lines that match or do not match a pattern from one or more files. (Manual: man 1 grep, GNU, FreeBSD.)
join Take two files sorted by a common field and join their lines on the value of that field. Lines with values that do not appear in the other file are discarded. (Manual: man 1 join, GNU, FreeBSD.)
paste Merge corresponding lines of several files side by side, or (with -s) combine several consecutive lines in a text file into one. (Manual: man 1 paste, GNU, FreeBSD.)
sort Sort lines by key fields. (Manual: man 1 sort, GNU, FreeBSD.)
uniq Find or remove repeated lines. (Manual: man 1 uniq, GNU, FreeBSD.)
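
The behavior described for join (match lines on a common field, discard lines with no counterpart in the other file) can be sketched in Python. This is an illustration only: it uses a dictionary lookup, so unlike join(1) it does not need sorted input, and the function name `inner_join` and the sample rows are assumptions made for the example.

```python
from collections import defaultdict


def inner_join(left_rows, right_rows):
    """Join rows on their first field; rows without a match are dropped."""
    right_by_key = defaultdict(list)
    for key, *rest in right_rows:
        right_by_key[key].append(rest)

    joined = []
    for key, *rest in left_rows:
        # Every matching pair of rows produces one output row.
        for other in right_by_key.get(key, []):
            joined.append([key, *rest, *other])
    return joined


# Hypothetical sample data: the first field is the join key.
people = [["1", "alice"], ["2", "bob"], ["3", "carol"]]
emails = [["1", "alice@example.com"], ["3", "carol@example.com"],
          ["4", "dave@example.com"]]

result = inner_join(people, emails)
for row in result:
    print(" ".join(row))
```

Keys 2 and 4 appear in only one input, so they are absent from the result.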

Other tools

Name Description
csvquote See the CSV section.
GNU datamash Perform statistical operations on text input.
Hawk Transform text from the command-line using Haskell expressions.
rq See the JSON section.

CSV

CSV, TSV, and other delimiter-separated value formats. Tools belong on this list if they support field quoting.

Name and link Description
csv-nix-tools List *nix system information such as environment variables, files, processes, network connections, users as CSV. Manipulate and pretty-print CSV. Execute CSV rows as commands.
csv2md Convert CSV to Markdown tables.
csv2html Convert CSV to HTML tables.
csvfaker Generate CSV files with fake data. Supports different types of fake data in different locales: names, cities, jobs, email addresses, and others.
csvfix (unofficial mirror) A multitool. Compare, filter, normalize, split, and validate CSV files. Reorder, remove, split, and merge fields. Convert data between fixed-width, multi-line, XML, and DSV format. Generate SQL statements.
csvkit csvkit is a suite of command-line tools for converting to and working with CSV: convert, clean, cut, grep, join, sort, stack, format, render, query, analyze, etc.
csvquote Transform CSV to and from a format processable with Awk-like tools.
csvtk Search, sample, cut, join, transpose, and sort CSV/TSV files. Rename columns. Replace fields and generate new fields from existing fields. Plot data as vector or raster histograms and box, line, and scatter plots. Convert CSV to Markdown. Convert XLSX to CSV. Split XLSX sheets.
CSVtoTable Convert CSV to a searchable and sortable HTML table.
dasel See the JSON section.
Graphtage See the JSON section.
jp (sgreben) Plot data. See the JSON section.
Mario See the JSON section.
MCMD (M-Command) Select, sample, cut, join, sort, reformat, and generate CSV files. Contains a large set of commands.
Miller sed, awk, cut, join and sort for name-indexed data such as CSV and tabular JSON.
pawk Process text with Awk-like patterns, but Python code.
rows A Python library with a CLI. Convert between a number of file formats for tabular data: CSV, XLS, XLSX, ODS, and others. Query the data (via SQLite). Combine tables. Generate schemas.
rq See the JSON section.
scrubcsv Remove bad lines from a CSV file and normalize the rest. Written in Rust.
tab A non-Turing-complete statically typed programming language for data processing. An alternative to Awk.
teip Select fields, character ranges, or regular expression matches from the standard input. Replace them with the output of a command.
eBay's TSV utilities Filtering, statistics, sampling, joins and other operations on TSV files. High performance, especially good for large datasets. Written in D.
tv View delimited files in the terminal.
VisiData Interactively explore data in TSV, CSV, XLS, XLSX, HDF5, JSON, and other formats. Introduction.
xsv Index, slice, analyze, split, and join CSV files.

SQL-based tools

See the big comparison table. It covers

  • AlaSQL CLI
  • csvq
  • csvsql
  • fsql
  • q
  • RBQL
  • rows
  • Sqawk (dbohdan)
  • sqawk (tjunier)
  • Squawk
  • termsql
  • trdsql
  • textql
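
The general idea behind running SQL over delimited text can be sketched with Python's standard library: load the CSV into an in-memory SQLite table, then query it. This is not how any particular tool above is implemented. The function name `query_csv`, the table name `data`, and the sample rows are assumptions made for the example.

```python
import csv
import io
import sqlite3

# Made-up sample data with a header row.
CSV_TEXT = """name,city,age
Alice,Oslo,34
Bob,Lima,27
Carol,Oslo,45
"""


def query_csv(csv_text, sql, table="data"):
    """Load CSV text into an in-memory SQLite table and run one query."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}"' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')

    # All values are inserted as text; cast in SQL if numbers are needed.
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)

    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


rows = query_csv(
    CSV_TEXT,
    "SELECT city, COUNT(*) FROM data GROUP BY city ORDER BY city",
)
print(rows)
```

The sketch does not infer column types, which is one of the areas where the listed tools differ from one another.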

JSON

Name and link Description
clconf See the YAML section.
dasel Query and update data structures from the command line. Comparable to jq/yq but supports JSON, TOML, YAML, and XML. Static binaries available for releases.
fx Run arbitrary JavaScript on JSON input. Standalone binaries available.
gojq A pure Go implementation of jq (see below). Supports YAML input and output.
Graphtage Compare and merge tree-like structures semantically. Supports JSON, JSON5, XML, HTML, YAML, and CSV. Can be used as a Python library.
gron Convert JSON to and from flat, greppable lists of "path=value" statements.
JC Convert the output of standard command line tools to JSON.
jello Query JSON and JSON Lines with Python code. Output the result in a line-based format suitable for creating Bash arrays. Generate a grep-able schema.
jet Convert between and query JSON, Clojure's edn, and Transit.
jfq Query and transform JSON with the JSONata language.
jid Explore JSON interactively with filtering queries like jq.
jiq Explore JSON interactively with jq. Requires jq.
jj Query and modify values in JSON or JSON Lines with a key path.
jl Query and manipulate JSON using a tiny functional language.
jo Create JSON objects from the shell.
jp (jmespath) Query JSON with JMESPath.
jp (sgreben) Plot JSON and CSV data in the terminal. Supports different kinds of plots: bar charts, line charts, scatter plots, histograms, and heatmaps.
jplot Plot real-time JSON data in the terminal (works with terminals supporting graphic rendering).
jq Create and manipulate JSON with a functional (as in "functional programming") DSL. Can convert JSON to other formats.
jql Create and manipulate JSON with a Lisp-syntax DSL.
jtbl Format JSON or JSON Lines as a plain-text table.
jtc Create, manipulate, search, validate JSON with path expressions. Can be used as a C++14 library.
emuto CLI tool similar to jq. Create and manipulate JSON and other files. Can be compiled to JavaScript.
jshon Create and manipulate JSON using getopt-style command-line options.
json2 Convert JSON to and from flat, greppable lists of "path=value" statements. Modeled after xml2.
jsonaxe Create and manipulate JSON with a Python-based DSL. Inspired by jq.
json Run arbitrary JavaScript on JSON input.
json-table Convert nested JSON into CSV or TSV for processing in the shell.
json.tool (Python 3 docs) Validate and pretty-print JSON. This module is part of the standard library of Python 2/3 and is likely to be available wherever Python is installed.
jsonwatch Track changes in JSON data from the command line. Works like watch -d.
lobar Explore JSON interactively or process it in batch with a wrapper for lodash.chain(). An alternative to jq with a JavaScript syntax.
Mario Manipulate and convert between CSV, JSON, YAML, TOML, and XML with Python code.
qpyson Query and manipulate JSON with Python.
query-json A faster jq implementation written in Reason Native (OCaml).
quicktype Infer the underlying model of the JSON and output as types for various programming languages or JSON Schema. CLI and Web UI.
ramda-cli Manipulate JSON with the Ramda functional library, and either LiveScript or JavaScript syntax.
RecordStream Create, manipulate, and output a stream of records, or JSON objects. Can retrieve records from an SQL database, MongoDB, Atom feeds, XML, and other sources.
rq Convert between Apache Avro, CBOR, CSV, JSON, MessagePack, Protocol Buffers, TOML, YAML, and Awk-style plain text.
validjson Validate or pretty-print JSON.
VisiData Explore data interactively. See the CSV section.
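
The "path=value" flattening mentioned for gron and json2 can be approximated in a few lines of Python. The output format below is illustrative and is not the exact output of either tool. The function name `flatten`, the root name `json`, and the sample document are assumptions made for the example.

```python
import json


def flatten(value, path="json"):
    """Yield one 'path = value' line per scalar in a parsed JSON document.

    Simplifications: empty objects and arrays produce no lines, and keys
    are not escaped, so keys containing dots would be ambiguous.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        yield f"{path} = {json.dumps(value)}"


doc = json.loads('{"name": "Ann", "tags": ["a", "b"], "address": {"city": "Oslo"}}')
lines = list(flatten(doc))
print("\n".join(lines))
```

Because each line carries its full path, line-oriented tools such as grep can select values without parsing the nesting.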

XML, HTML

Name and link Description
dasel Supports XML. See the JSON section.
Graphtage See the JSON section.
hred Query XML and HTML with a query language based on CSS selectors.
html-xml-utils A number of simple utilities (like hxcopy, hxpipe, hxunent, hxselect) for manipulating HTML and XML files from W3C. Written in C, quite old-fashioned, but still relevant and maintained.
Mario Supports XML. See the JSON section.
pup Query HTML pages with CSS selectors. Static binaries available for releases. Inspired by jq.
Saxon Query XML and HTML data with XPath. Documentation.
sml2 Convert between XML and SML, a simplified XML representation.
Temme Query HTML with CSS-like selectors to extract JSON. Temme extends CSS selectors with value capture patterns.
tidy-html5 Validate, fix, and reformat HTML(5), XHTML, and XML documents. Convert HTML to XHTML.
tq Query HTML with CSS selectors.
Xidel Query or modify XML and HTML pages with XPath, XQuery 3, and CSS selectors.
xml-to-json-fast Convert XML to JSON. Can handle very large XML files.
xml2 Convert XML and HTML to and from flat, greppable lists of "path=value" statements. Source code mirror.
xmljson Convert multiple and large XML files to JSON. Written in Swift.
XMLLint Query (including XSLT), validate and reformat XML documents.
XMLStarlet Query, modify, and validate XML documents.
xq jq wrapper for XML documents.
xsltproc Transform XML documents using XSLT and EXSLT.

YAML, TOML

With a format converter like Remarshal (below) you can use JSON tools to process YAML and TOML, but make sure you do not lose data in the conversion.

Name and link Description
clconf Merge multiple config files and extract values from them using path string. Supports JSON and YAML. Can be used as a Go library.
dasel Supports TOML and YAML. See the JSON section.
gojq Supports YAML. See the JSON section.
Graphtage Supports YAML. See the JSON section.
Mario Supports YAML. See the JSON section.
Remarshal Convert between CBOR, JSON, MessagePack, TOML, and YAML. Validate each of the formats. Pretty-print JSON, TOML, and YAML.
rq Supports TOML and YAML. See the JSON section.
shyaml Query YAML. Can output null-terminated strings for use in shell scripts.
validtoml Validate TOML.
validyaml Validate or pretty-print YAML.
yaml-tools A set of CLI tools to manipulate YAML files (merge, delete, etc...) with comment preservation, based on ruamel.yaml.
yq (kislyuk) jq wrapper for YAML.
yq (mikefarah) Query, modify, and merge YAML. Convert to and from JSON.
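
As an example of the data loss to watch for when converting YAML or TOML to JSON: both formats can carry timestamps, while JSON has no date type. The Python sketch below stands in for a parsed config with a `datetime` value. The `default=str` fallback is one arbitrary choice made for the example; converters such as Remarshal may handle this case differently.

```python
import json
from datetime import datetime, timezone

# Stand-in for a parsed YAML/TOML document that contains a timestamp.
config = {
    "released": datetime(1979, 5, 27, 7, 32, tzinfo=timezone.utc),
    "retries": 3,
}

# JSON cannot represent datetime, so it has to be turned into a string.
as_json = json.dumps(config, default=str)
round_tripped = json.loads(as_json)

print(type(config["released"]).__name__)         # datetime
print(type(round_tripped["released"]).__name__)  # str
```

After the round trip the value is an ordinary string, so a consumer can no longer tell it apart from text that merely looks like a date.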

Configuration files

/etc/hosts

Name and link Description
hostctl Add and remove entries in /etc/hosts. Disable (comment out) and enable (uncomment) entries. Not idempotent. Preserves arbitrary comments above its section of the hosts file. Works with groups of entries called "profiles".
hostess Add and remove entries in /etc/hosts. Disable (comment out) and enable (uncomment) entries. Check if a hostname exists. Reformat the hosts file. Convert the entries to JSON. Idempotent. Removes arbitrary comments.
hosts Add and remove entries in /etc/hosts. Change a hostname's IP address. Idempotent. Preserves arbitrary comments. Can be used as a Tcl library.
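
"Idempotent" here means that repeating the same operation leaves the file unchanged after the first run. A minimal Python sketch of an idempotent add that also preserves comments follows. It operates on a string rather than the real /etc/hosts, and the function name `add_host` and the sample addresses are assumptions, not the behavior of any tool above.

```python
def add_host(hosts_text, ip, hostname):
    """Return hosts_text with 'ip hostname' present exactly once."""
    lines = hosts_text.splitlines()
    for line in lines:
        # Ignore trailing comments when looking for an existing entry.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and entry[0] == ip and hostname in entry[1:]:
            return hosts_text  # already present: change nothing
    lines.append(f"{ip} {hostname}")
    return "\n".join(lines) + "\n"


original = "127.0.0.1 localhost\n# my comment\n"
once = add_host(original, "10.0.0.5", "db.internal")
twice = add_host(once, "10.0.0.5", "db.internal")

print(once, end="")
print(once == twice)
```

Running the function a second time returns its input untouched, and the comment line survives both calls.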

INI

| Name and link | Platform | License | Description |
|---|---|---|---|
| cfget | Any with Python 2.x? | GNU GPLv2+ | Retrieve properties as shell script commands to set the corresponding variables (with --dump exports). Retrieve properties' values as plain text. Substitute values from an INI file in an Autoconf-style template. Supports plug-ins. Chokes on section names and keys with spaces. |
| confget | Linux, FreeBSD | Two-clause BSD | Retrieve properties and sections as shell script commands to set the corresponding variables. Retrieve properties' values as plain text. Check for existence of properties. List sections. Find values that match a pattern. Read-only. |
| crudini | Any with Python 2.x | GNU GPLv2 | Retrieve properties and sections as INI fragments or shell script commands to set the corresponding variables. Retrieve properties' values as plain text. Set properties. Remove properties and sections. Create empty sections. Merge INI files. Changes files in place. |
| inicomp | Windows, *nix | Apache 2.0 | Compare INI (and also Windows .reg) files. |
| IniFile (DOS version) | Windows (x86, x86-64), MS-DOS | Closed-source freeware | Retrieve properties and sections as batch file commands to set the corresponding variables. Set properties. Remove properties and sections. Changes files in place. |
| initool | Linux, FreeBSD, Windows | MIT | Retrieve properties and sections as INI fragments. Retrieve properties' values as plain text. Set properties. Check for existence of properties and sections. Remove properties and sections. Outputs the updated INI file. |
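
Several of the tools above can emit INI properties as shell commands that set variables. The general idea can be sketched with Python's standard library. The `section_key` naming scheme, the function name `ini_to_shell`, and the sample file are assumptions made for the example, not the output format of any listed tool.

```python
import configparser
import shlex

# Made-up INI content for the example.
INI_TEXT = """[server]
host = example.com
motd = hello world
"""


def ini_to_shell(ini_text):
    """Return shell assignments of the form section_key=value.

    Simplification: section and key names are assumed to be valid
    shell identifiers already.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)

    assignments = []
    for section in parser.sections():
        for key, value in parser.items(section):
            # shlex.quote keeps spaces and metacharacters safe for the shell.
            assignments.append(f"{section}_{key}={shlex.quote(value)}")
    return assignments


assignments = ini_to_shell(INI_TEXT)
print("\n".join(assignments))
```

Quoting the values is the important part: without it, a value such as `hello world` would be split by the shell when the output is evaluated.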

Multiple formats

Name and link Description
Augeas Query and modify a number of file formats. Not all of the formats are equally well supported by Augeas and for some only a limited subset of all valid files can be parsed.
Elektra Query and modify configuration files. Shares Augeas' limitations when it comes to application-specific configuration files (it uses the same lenses), but has better support for generic formats such as JSON and INI.

Log files

Name and link Description
Squawk Query Apache and Nginx log files. See the SQL-based tool comparison.
lnav Query and watch log files. Has batch and interactive mode. Supported formats include the Common Log Format, CUPS page_log, syslog, strace, and generic timestamped messages. Can perform SQL queries.

Templating for structured text

Listed below are restricted programming language interpreters and templating tools that produce structured text output. They are generally intended to remove repetition in configuration files. They are distinct from unstructured templating tools like the jinja2 CLI program, which should not be added to this table.

| Name and link | Output format | Turing-complete? | Syntax | I/O | Description |
|---|---|---|---|---|---|
| CUE | JSON | Yes? | Extended JSON | ? | A constraint language for JSON configuration data. Can generate and validate JSON. |
| Dhall | JSON, YAML | No | Haskell-inspired | Limited to importing libraries from files and HTTP(S) URLs (with protection against leaking your data to the server) | A statically-typed functional configuration language. Has a standard formatting tool. |
| jk | JSON, YAML, plain text | Yes | JavaScript | Disk I/O | Generate configuration files using JavaScript (V8 VM). |
| Jsonnet | JSON, INI, XML, YAML, plain text | Yes | Extended JSON | None | A functional configuration language. Has a standard formatting tool. |
| rjsone | JSON, YAML | No? | Extended JSON | None | A CLI tool for the JSON-e templating language. |
| ytt | YAML | No | YAML/Python hybrid | None? | A templating tool for YAML built upon the Starlark configuration language. |

Bonus round: CLIs for single-file databases

| Name and link | Description | File format |
|---|---|---|
| Firebird | Firebird is a FOSS database that can be used from a single file, like SQLite. "isql is a program that allows the user to issue arbitrary SQL commands". | Binary |
| Fsdb | A flat-file database for shell scripting. | Text-based, TSV with a header or "key: value" |
| GNU Recutils | "[A] set of tools and libraries to access human-editable, plain text databases called recfiles." | Text-based, roughly "key: value" |
| SDB | "[A] simple string key/value database based on djb's cdb disk storage and supports JSON and arrays introspection." | Binary |
| sqlite3(1) | "[A] simple command-line utility [...] that allows the user to manually enter and execute SQL statements against an SQLite database." | Binary |

License

The contents of this document are licensed under the Creative Commons Attribution 4.0 International License. By contributing you agree to release your contribution under this license.

Disclosure

csv2html, hosts, Sqawk, jsonwatch, Remarshal and initool are developed by the curator of this document.
