1.7 billion #JSON objects: #dataset of every publicly available comment on #Reddit.
▻https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment
diskDB by arvindr21
▻http://arvindr21.github.io/diskDB
A lightweight disk-based JSON database with a MongoDB-like API for Node.
You will never know you are interacting with a file system.
Datalib: JavaScript Data Utilities
▻http://uwdata.github.io/datalib
Datalib is a JavaScript data utility library. It provides facilities for data loading, type inference, common statistics, and string templates. This includes:
Loading and parsing data files (JSON, TopoJSON, CSV, TSV).
Summary statistics (mean, deviation, median, correlation, histograms, etc.).
Group-by aggregation queries, including streaming data support.
Data-driven string templates with expressive formatting filters.
Utilities for working with JavaScript functions, objects and arrays.
While created to power Vega and related projects, datalib is a standalone library useful for data-driven JavaScript applications on both the client (web browser) and server (e.g., node.js).
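To illustrate the kind of group-by aggregation such a utility library provides, here is a plain-JavaScript sketch of the concept (this is not datalib's actual API, just the underlying idea):

```javascript
// Group an array of objects by a key and compute per-group statistics.
// A plain-JS sketch of the kind of aggregation a data utility library offers.
function groupBy(rows, key, valueField) {
  const groups = new Map();
  for (const row of rows) {
    const k = row[key];
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(row[valueField]);
  }
  const summary = {};
  for (const [k, values] of groups) {
    const sum = values.reduce((a, b) => a + b, 0);
    summary[k] = { count: values.length, mean: sum / values.length };
  }
  return summary;
}

const data = [
  { city: 'Paris', temp: 10 },
  { city: 'Paris', temp: 14 },
  { city: 'Lyon', temp: 12 },
];
console.log(groupBy(data, 'city', 'temp'));
// { Paris: { count: 2, mean: 12 }, Lyon: { count: 1, mean: 12 } }
```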
typicode/lowdb
▻https://github.com/typicode/lowdb
Flat JSON file database built on lodash API
A good choice for a quick development database.
mapbox/geobuf · GitHub
▻https://github.com/mapbox/geobuf
Geobuf is a compact binary encoding for geographic data. #geobuf provides lossless compression of #geojson and #topojson data into protocol buffers. Advantages over using JSON-based formats alone:
Very compact: typically makes GeoJSON 6-8 times smaller and TopoJSON 2-3 times smaller.
Smaller even when comparing gzipped sizes: 2-2.5x compression for GeoJSON and 20-30% for TopoJSON.
Very fast encoding and decoding — even faster than native JSON parse/stringify.
Can accommodate any GeoJSON and TopoJSON data, including extensions with arbitrary properties.
The encoding format also potentially allows:
Easy incremental parsing — get features out as you read them, without the need to build an in-memory representation of the whole data.
Partial reads — read only the parts you actually (...)
#map
JSONx is an IBM standard format to represent JSON as XML
via ▻http://devopsreactions.tumblr.com/post/112124376421/jsonx-is-an-ibm-standard-format-to-represent-json
HTTPie : a CLI, cURL-like tool for humans
▻https://github.com/jakubroztocil/httpie
HTTPie (pronounced aych-tee-tee-pie) is a command line #HTTP client. Its goal is to make #CLI interaction with web services as human-friendly as possible. It provides a simple http command that allows for sending arbitrary HTTP requests using a simple and natural syntax, and displays colorized responses. HTTPie can be used for testing, debugging, and generally interacting with HTTP servers.
It's written in #python, but it installs with #homebrew, and it's meant to replace #curl
Same kind of tool, but with a graphical interface:
#rest
Personally, I still use curl+#jq+#xmlstarlet for throwaway #bash scripts
Still haven't looked at xml2json
cujojs/jiff
▻https://github.com/cujojs/jiff
JSON Patch and diff based on rfc6902
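The core of RFC 6902 is easy to sketch. The following is a toy subset (add/replace/remove only, with no array "-" index or "~" escaping) — real libraries like jiff implement the full spec, including diffing:

```javascript
// Minimal sketch of applying an RFC 6902 JSON Patch.
// Supports add/replace/remove on simple paths; illustrative only.
function applyPatch(doc, patch) {
  const result = JSON.parse(JSON.stringify(doc)); // work on a deep copy
  for (const op of patch) {
    const keys = op.path.split('/').slice(1); // "/a/b" -> ["a", "b"]
    const last = keys.pop();
    const parent = keys.reduce((node, k) => node[k], result);
    if (op.op === 'add' || op.op === 'replace') parent[last] = op.value;
    else if (op.op === 'remove') delete parent[last];
  }
  return result;
}

const doc = { title: 'old', tags: { json: true } };
const patched = applyPatch(doc, [
  { op: 'replace', path: '/title', value: 'new' },
  { op: 'remove', path: '/tags/json' },
]);
console.log(patched); // { title: 'new', tags: {} }
```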
Giles Bowkett: Why Panda Strike Wrote the Fastest #JSON #Schema Validator for #Node.js
▻http://gilesbowkett.blogspot.co.uk/2015/01/why-panda-strike-wrote-fastest-json.html
mapbox/geojson-vt
▻https://github.com/mapbox/geojson-vt
Slice GeoJSON into vector tiles on the fly in the browser.
A highly efficient JavaScript library for slicing GeoJSON data into vector tiles on the fly, primarily designed to enable rendering and interacting with large geospatial datasets on the browser side (without a server). Resulting tiles conform to the JSON equivalent of the vector tile specification. To make data rendering and interaction fast, the tiles are simplified, retaining the minimum level of detail appropriate for each zoom level (simplifying shapes, filtering out tiny polygons and polylines). Here’s geojson-vt in action in Mapbox GL JS, dynamically loading a 100 MB US zip codes GeoJSON with 5.4 million points:
The problem of managing schemas - O’Reilly Radar
▻http://radar.oreilly.com/2014/11/the-problem-of-managing-schemas.html
with #CSV and #JSON data, the data has a schema, but the schema isn’t stored with the data. For example, CSV files have columns, and those columns have meaning. They represent IDs, names, phone numbers, etc. Each of these columns also has a data type: they can represent integers, strings, or dates. There are also some constraints involved — you can dictate that some of those columns contain unique values or that others will never contain nulls. All this information exists in the head of the people managing the data, but it doesn’t exist in the data itself.
The people who work with the data don’t just know about the schema; they need to use this knowledge when processing and analyzing the data. So the schema we never admitted to having is now coded in Python and Pig, Java and R, and every other application or script written to access the #data.
solution: #AVRO Apache ▻http://avro.apache.org
cc @lazuly
The main thing is for the CSV to include a header line with explicit names for each column (nothing is worse than a CSV without column names...).
As for data types, just run "csvstat" (on the first million lines, if the file is huge) to get a very clear picture of the data (number of unique values, mean, median, whether fields can be null...).
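The kind of per-column type inference such tools perform can be sketched naively (a toy illustration, not csvstat's actual implementation):

```javascript
// Naive per-column type inference over parsed CSV rows (arrays of strings).
// A toy sketch of what tools like csvstat do, not their implementation.
function inferTypes(header, rows) {
  return header.map((name, i) => {
    const values = rows.map((r) => r[i]);
    const nonEmpty = values.filter((v) => v !== '');
    let type = 'string';
    if (nonEmpty.every((v) => /^-?\d+$/.test(v))) type = 'integer';
    else if (nonEmpty.every((v) => !isNaN(parseFloat(v)) && isFinite(v))) type = 'float';
    return { name, type, nulls: values.length - nonEmpty.length };
  });
}

const header = ['id', 'price', 'label'];
const rows = [
  ['1', '9.99', 'apple'],
  ['2', '3.50', ''],
];
console.log(inferTypes(header, rows));
// [ { name: 'id', type: 'integer', nulls: 0 },
//   { name: 'price', type: 'float', nulls: 0 },
//   { name: 'label', type: 'string', nulls: 1 } ]
```

The point of the O'Reilly article stands, though: this inferred schema lives in the script, not with the data, which is exactly what formats like Avro fix.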
a universal #métadonnées format (real metadata, that is…)
an old #serpent_de_mer (an idea that keeps resurfacing and never lands)!
torodb/torodb
▻https://github.com/torodb/torodb
ToroDB is an open source, document-oriented, JSON database that runs on top of PostgreSQL. JSON documents are stored relationally, not as a blob/jsonb. This leads to significant storage and I/O savings. It speaks natively the MongoDB protocol, meaning that it can be used with any mongo-compatible client.
jdorn/json-editor
▻https://github.com/jdorn/json-editor
JSON Editor takes a JSON Schema and uses it to generate an HTML form.
It has full support for JSON Schema version 3 and 4 and can integrate with several popular CSS frameworks (bootstrap, foundation, and jQueryUI).
The best of both worlds?
OpenID Connect ratified as a standard
▻http://www.heise.de/open/meldung/OpenID-Connect-als-Standard-ratifiziert-2126073.html
The standard, drafted by companies such as Google, Microsoft, Deutsche Telekom and Salesforce.com, is expected sooner or later to replace OpenID 2.0 on the web — thanks in part to the immense popularity of OAuth.
OpenID Connect FAQ and Q&As | OpenID
▻http://openid.net/connect/faq
OpenID Connect is an interoperable authentication protocol based on the OAuth 2.0 family of specifications. It uses straightforward REST/JSON message flows with a design goal of “making simple things simple and complicated things possible”. It’s uniquely easy for developers to integrate, compared to any preceding Identity protocol.
OpenID Connect lets developers authenticate their users across websites and apps without having to own and manage password files. For the app builder, it provides a secure, verifiable answer to the question: “What is the identity of the person currently using the browser or native app that is connected to me?”
OpenID Connect allows for clients of all types, including browser-based JavaScript and native mobile apps, to launch sign-in flows and receive verifiable assertions about the identity of signed-in users.
▻http://www.vevo.com/watch/hannah-montana/the-best-of-both-worlds/USWV20620226
;-)
JSON Mail Access Protocol Specification (JMAP)
▻http://jmap.io
JMAP is a new JSON-based API for synchronising a mail client with a mail server. It is intended as a replacement for IMAP. The specification is based on the API currently used by FastMail’s web app. It aims to be compatible with the IMAP data model, so that it can be easily implemented on a server that currently supports IMAP, but allows for reduced data usage, more efficient synchronisation, and bundling of requests for latency mitigation, and is generally much easier to work with than IMAP.
FYI,
Josh Begley
Data artist & web developer
▻http://joshbegley.com
The Chilling Geometry of Every US Military Base Seen From Space
▻http://gizmodo.com/the-chilling-geometry-of-every-us-military-base-seen-fr-1481870788
A list of all recorded drone strikes
(data available online, #JSON)
▻http://dronestre.am
The lie of the API | Ruben Verborgh
▻http://ruben.verborgh.org/blog/2013/11/29/the-lie-of-the-api
Futuristic? It’s not: it works already, and it’s really simple. Here is a designer chair from the Cooper-Hewitt museum:
▻http://collection.cooperhewitt.org/objects/35460799
The cool thing is that machine clients use the same URL to access a JSON version:
curl http://collection.cooperhewitt.org/objects/35460799 -H "Accept: text/html"
curl http://collection.cooperhewitt.org/objects/35460799 -H "Accept: application/json"
Not only does this make it possible to share URLs between different parties, it also makes access really simple. I don’t have to read the manual. Instead, I just use the same interface I use every day: the URL. It works the same way everywhere.
This technique is called content negotiation and it is a characteristic of REST APIs.
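The server side of this can be sketched as a pure function from the Accept header to a representation (a minimal illustration — not the Cooper-Hewitt implementation, and real servers also handle quality values like "text/html;q=0.8"):

```javascript
// Minimal sketch of server-side content negotiation: one resource,
// one URL, and the representation chosen from the Accept header.
function negotiate(acceptHeader, resource) {
  if ((acceptHeader || '').includes('application/json')) {
    return { contentType: 'application/json', body: JSON.stringify(resource) };
  }
  // Default representation for browsers: HTML.
  return { contentType: 'text/html', body: `<h1>${resource.title}</h1>` };
}

const chair = { title: 'Designer chair', id: 35460799 };
console.log(negotiate('application/json', chair).contentType); // application/json
console.log(negotiate('text/html', chair).body); // <h1>Designer chair</h1>
```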
Yes, yes, yes. A nice article that takes to task the people who love to overcomplicate things.
#API #REST #content-negotiation #mime-type #HTTP #URL
But often, to make things easier for clients, it's simpler to have them pass parameters via GET or POST than to ask them to handle HTTP headers. At least that was my impression. (I'm pretty bad with HTTP headers myself, by the way.)
URLs identify concepts, and each URL can have multiple representations. This means that a single resource can be identified with one URL for all clients. Each client just indicates to the server whether it wants HTML or JSON or something else, and the server replies with a representation the client understands.
This lack of knowledge comes from developers being all too familiar with the programming-oriented environment they usually work with, but mostly oblivious about the resource-oriented nature of the Web.
The Web is an information space, not a programming space.
I imagine that developers were approached with the question “can you build an API?” And this is what they did.
But the question was wrong. It should have been: “Can you add machine access?”
The #JSON data interchange format is now an #ECMA standard:
▻http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
marianoguerra/#json.human.js
▻https://github.com/marianoguerra/json.human.js
Convert JSON to human readable HTML
Data Science Toolkit
▻http://www.datasciencetoolkit.org
A collection of the best open data sets and open-source tools for data science, wrapped in an easy-to-use REST/JSON API with command line, Python and JavaScript interfaces. Available as a self-contained Vagrant VM or EC2 AMI that you can deploy yourself. It’s essentially a specialized Linux distribution, with a lot of useful data software pre-installed and exposing a simple interface. For full documentation, see (...)
dscape/clarinet · GitHub
▻https://github.com/dscape/clarinet
SAX based evented streaming JSON parser in JavaScript
olivernn/lunr.js · GitHub
▻https://github.com/olivernn/lunr.js
Lunr.js is a small, full-text search library for use in the browser. It indexes JSON documents and provides a simple search interface for retrieving documents that best match text queries.
JSON Web Token (JWT)
▻http://self-issued.info/docs/draft-ietf-oauth-json-web-token.html
JSON Web Token (JWT) is a compact URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JavaScript Object Notation (JSON) object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure, enabling the claims to be digitally signed or MACed and/or encrypted.
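A JWT is just three base64url-encoded parts joined by dots. The structure can be sketched in plain Node (decoding only — in production the signature MUST be verified with a proper JWS library before trusting any claim):

```javascript
// Decode the payload of a JWT without verifying the signature.
// A sketch of the token's structure only, NOT a validation routine.
function decodeJwtPayload(token) {
  const payloadB64 = token.split('.')[1]; // header.payload.signature
  const b64 = payloadB64.replace(/-/g, '+').replace(/_/g, '/'); // base64url -> base64
  return JSON.parse(Buffer.from(b64, 'base64').toString('utf8'));
}

// Build an unsigned demo token so the example is self-contained.
const claims = { iss: 'example.org', sub: 'alice' };
const demoToken = [
  Buffer.from(JSON.stringify({ alg: 'none', typ: 'JWT' })).toString('base64'),
  Buffer.from(JSON.stringify(claims)).toString('base64'),
  '', // empty signature part for the "none" algorithm
].join('.');

console.log(decodeJwtPayload(demoToken).sub); // prints "alice"
```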
If I understand correctly, it's for sending data between two services, or between a server and a client, in a secure way, but over #http?
I haven't fully dug into it (I'm bookmarking it to come back to later), but roughly it's a way to exchange a secure token (verifiably coming from one party) using JSON. It's used notably for proof-of-purchase.
Sphinx 2.1 : JSON Attributes
▻http://sphinxsearch.com/blog/2013/02/07/sphinx-2-1-json-attributes
We’re delighted to announce that Sphinx 2.1 begins support of JSON attributes. While complete support is yet to come (some quirks and limitations are yet to be ironed out), we consider this to be a major step ahead. Storing sparse key-value data is no longer a fundamental issue in Sphinx …
Source: Sphinx - adrian
Full support in trunk:
▻http://sphinxsearch.com/blog/2013/08/08/full-json-support-in-trunk
#Lobbies on dataprotection - Wiki #veille de La Quadrature du Net
▻http://www.laquadrature.net/wiki/Lobbies_on_dataprotection
This page lists the various lobbies’ documents calling for an extensive definition of personal #data, produced during the adoption process of the European Commission’s Proposal for a General Data Protection Regulation.
#données_personnelles #législation #Europe
And to follow the European Parliament’s activities: ►http://parltrack.euwiki.org
Parltrack is a European initiative to improve the transparency of legislative processes. It combines dossiers, MEPs, vote results and committee agendas into a unique database and allows the #tracking of dossiers using email and RSS. Most of the data displayed is also available for further processing in JSON format. Using Parltrack it’s easy to see at a glance which dossiers are being handled by committees and MEPs.
(links seen on La Quadrature's mailing list) #open_gov #transparence