大象教程
首页
Spark
Hadoop
HDFS
MapReduce
Hive
Pig 教程
Pig 教程
Pig 体系结构
Pig 安装
Pig 执行
Pig Grunt Shell
Pig Latin 基础
Pig 读取数据
Pig 存储数据
Pig Dump 运算符
Pig Describe 运算符
Pig Explain 运算符
Pig illustrate 运算符
Pig GROUP 运算符
Pig Cogroup 运算符
Pig JOIN 运算符
Pig Cross 运算符
Pig Union 运算符
Pig SPLIT 运算符
Pig FILTER 运算符
Pig DISTINCT 运算符
Pig FOREACH 运算符
Pig ORDER BY 运算符
Pig LIMIT 运算符
Pig eval(求值) 函数
Pig Load & Store 函数
Pig Bag & Tuple 函数
Pig 字符串(String) 函数
Pig 日期时间函数
Pig 数学函数
#Pig 存储数据 在上一章中,我们学习了如何将数据加载到Apache Pig中。您可以使用store运算符将加载的数据存储在文件系统中。本章介绍如何使用Store运算符在Apache Pig中存储数据。 **语法** 下面给出了Store语句的语法。 ```bash STORE Relation_name INTO ' required_directory_path ' [USING function]; ``` 假设我们在HDFS中有一个具有以下内容的文件Student_data.txt ``` 001,Rajiv,Reddy,9848022337,Hyderabad 002,siddarth,Battacharya,9848022338,Kolkata 003,Rajesh,Khanna,9848022339,Delhi 004,Preethi,Agarwal,9848022330,Pune 005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar 006,Archana,Mishra,9848022335,Chennai ``` 如下所示,我们已使用LOAD运算符将其读入关系student。 ```bash grunt> student = LOAD 'hdfs://localhost:9000/Pig_Data/student_data.txt' USING PigStorage(',') as ( id:int, firstname:chararray, lastname:chararray, phone:chararray,city:chararray ); ``` 现在,让我们将关系存储在HDFS目录“/pig_Output/”中,如下所示。 ```bash grunt> STORE student INTO 'hdfs://localhost:9000/Pig_Output/' USING PigStorage (','); ``` 执行store语句后,您将获得以下输出。使用指定名称创建目录,数据将存储在其中。 ```bash 2015-10-05 13:05:05,429 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer. MapReduceLau ncher - 100% complete 2015-10-05 13:05:05,429 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: HadoopVersion PigVersion UserId StartedAt FinishedAt Features 2.6.0 0.15.0 Hadoop 2015-10-0 13:03:03 2015-10-05 13:05:05 UNKNOWN Success! Job Stats (time in seconds): JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime job_14459_06 1 0 n/a n/a n/a n/a MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature 0 0 0 0 student MAP_ONLY OutPut folder hdfs://localhost:9000/pig_Output/ Input(s): Successfully read 0 records from: "hdfs://localhost:9000/pig_data/student_data.txt" Output(s): Successfully stored 0 records in: "hdfs://localhost:9000/pig_Output" Counters: Total records written : 0 Total bytes written : 0 Spillable Memory Manager spill count : 0 Total bags proactively spilled: 0 Total records proactively spilled: 0 Job DAG: job_1443519499159_0006 2015-10-05 13:06:06,192 [main] INFO org.apache.pig.backend.hadoop.executionengine .mapReduceLayer.MapReduceLau ncher - Success! ``` 您可以如下所示验证存储的数据。 第1步: 使用ls命令列出目录Pig_output中的文件,如下所示。 ```bash hdfs dfs -ls 'hdfs://localhost:9000/pig_Output/' Found 2 items rw-r--r- 1 Hadoop supergroup 0 2015-10-05 13:03 hdfs://localhost:9000/pig_Output/_SUCCESS rw-r--r- 1 Hadoop supergroup 224 2015-10-05 13:03 hdfs://localhost:9000/pig_Output/part-m-00000 ``` 您可以观察到在执行store语句后创建了两个文件。 第2步: 使用cat命令,列出名为part-m-00000的文件的内容,如下所示。 ```bash $ hdfs dfs -cat 'hdfs://localhost:9000/pig_Output/part-m-00000' 1,Rajiv,Reddy,9848022337,Hyderabad 2,siddarth,Battacharya,9848022338,Kolkata 3,Rajesh,Khanna,9848022339,Delhi 4,Preethi,Agarwal,9848022330,Pune 5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar 6,Archana,Mishra,9848022335,Chennai ```
加我微信交流吧