IJRIT International Journal of Research in Information Technology, Volume 2, Issue 5, May 2014, Pg: 593-600
International Journal of Research in Information Technology (IJRIT) www.ijrit.com
ISSN 2001-5569
Optimized Method for Bulk Data Transfer on Internet
Varsha V. Sagare1, S. M. Chaware2
1 Student, Department of Computer Engg, University of Pune, Pune, Maharashtra, India
[email protected]
2 Doctor, Department of Computer Engg, University of Pune, Pune, Maharashtra, India
[email protected]
Abstract
Large datacenter operators with sites at multiple locations dimension their key resources according to the peak demand of the geographic area that each site covers. There has been renewed interest in the problem of transferring bulk data (on the order of terabytes) over commercial ISPs. A variety of routing algorithms use an intermediate server to transfer bulk data, which raises issues such as load on a single server, limited speed, and transit cost. This paper proposes a new routing technique to overcome these problems. The main approach is to increase transmission speed at minimum transit cost. Comparisons with previous approaches show superior performance in terms of cost and speed.
Keywords: ISP, SnF (Store-and-Forward), Data center, Bulk data, E2E (End-to-End), MP (Multipart).
1. Introduction
Online service companies such as Amazon, Facebook, Google, Microsoft, and Yahoo! have made huge investments in networks of datacenters. They transfer terabytes (TB) of data on a daily basis. Similarly, hosting and co-location services such as Equinix and Savvis employ distributed networks of datacenters that include tens of locations across the globe [1]. The growth of social networking has changed how we interact, share, and consume information, and most multimedia data now travels across networks worldwide. Large data transfers degrade network performance, and the result is delay in data delivery. Delay-tolerant (DT) transfer has opened the possibility of offering bulk downloads as a service that ISPs can provide. ISPs are used for routing and forwarding packets, but cooperating ISPs can enable a variety of bulk-data transfer services for both consumers and businesses. For example, Amazon provides a service that allows a user to transfer large amounts of data between different areas of the world through its private non-military network. For data transfer, an internal network costs less than the Internet. There is also demand for services like Netflix (a next-generation service) that download movies from the user's Netflix queue to an Xbox [18]. According to [2], the growth of social networking sites means that companies need to synchronize their data warehouses both within a country and across the world. News organizations need to move multimedia content to their web servers across the country so that data can be accessed from the closest server. The scientific community needs to move large data sets reliably, quickly, and cost-effectively. Two policies are currently used to download multimedia data:
1.1 E2E policy
In the End-to-End (E2E) transfer policy, delay-tolerant bulk (DTB) data flows from a sender to a receiver over a connection-oriented path through the network. An E2E transfer is characterized by a constant bit rate B that delivers the data within a deadline of T seconds. The problem with the E2E policy arises from time-zone differences and traffic load: it works well only during periods when the off-peak hours of the sending ISP coincide with the off-peak hours of the receiving ISP. Efficiency also decreases when residential requests coincide with corporate requests during off-peak hours.
1.2 SNF policy
The Store-and-Forward (SNF) policy addresses the problem raised by the E2E policy by placing a storage node between the transit provider and the receiver. The storage node holds replicas of the data, which reduces the load on the ISP. The transit cost is calculated from the on-peak and off-peak values. Our approach is to increase transmission speed at minimum transit cost, and we compare this technique with previous ones to show its superior performance.
2. Literature Review
The following parameters were considered during the literature survey.
2.1 BULK DATA TRANSFER POLICIES
Increasing speed
A Markov-chain TCP delay model for CBR-TCP flows is presented in [3]. The model captures the behaviour of VoIP and streaming flows. The delay performance of a video flow can be improved using packet splitting or parallel-connection heuristics. The survey in [4] studies the various routing protocols proposed for DTN and classifies them based on the type of knowledge used by the routing protocol. It is not possible to place each scheme into exactly one class, because most approaches are hybrid in nature and may fall into more than one category. The design and implementation of DOT, a flexible architecture for data transfer, is described in [5]. This architecture separates content negotiation from the data transfer itself. Applications determine what data they need to send and then use a new transfer service to send it. This transfer service acts as a common interface between applications and the lower-level network layers, facilitating innovation both above and below. The transfer service frees developers from re-inventing transfer mechanisms in each new application. New transfer mechanisms, in turn, can be easily deployed without modifying existing applications.
Transit cost of ISP
The general problem of finding a cost-optimal transfer of bulk data can be solved in polynomial time using minimum-cost flow algorithms on a time-expanded version of the underlying network [6]. The feasibility of a solution in which the ISP offers an “oracle” to P2P users is evaluated in [7]: when a P2P user supplies the oracle with a list of possible P2P neighbors, the oracle ranks them according to certain criteria, such as their proximity to the user or higher-bandwidth links. The common ISP practice of structuring tiered contracts according to the cost of carrying the traffic flows (e.g., offering a discount for traffic that is local) is found to be suboptimal in [8]. Dividing the contract into only three or four tiers based on both traffic cost and demand yields near-optimal profit for the ISP; other strategies such as cost-division bundling also work well.
2.2 CURRENT SYSTEM
Problem statement
Transferring large amounts of DTB data incurs very high transit costs, which are charged under the 95-percentile pricing scheme, and also has a negative impact on the QoS of interactive traffic.
Requirements
• Keep transit costs low under 95-percentile pricing.
• Avoid negative impacts on the QoS of interactive traffic.
Existing solution
Three existing solutions are available for transferring terabytes of bulk data over the Internet while avoiding additional transit cost and harm to the QoS of interactive traffic [11, 12, 13]:
• Build a dedicated network between sender and receiver, for example for a user who regularly transfers 27 TB of data around the world.
• Use postal or courier services to carry the data from sender to receiver.
• Use a commercial ISP and send the data with the existing E2E policy, for example over FTP or HTTP.
Existing method
• End-to-End: In the E2E method, source scheduling at the sender regulates the amount of DTB traffic that is sent to the receiver in each 5-minute slot over an end-to-end connection. An E2E transfer is characterized by a constant bit rate B that delivers the data within a deadline of T seconds [15].
• Store-and-Forward: The store-and-forward method first uploads data from the sender v to the transit storage node w within TR, and then pushes it from w towards the final receiver u. As a result, SnF has much more freedom than the E2E method. The problems with the SnF method are the load on a single server, transmission speed, and transit cost [17].
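The difference between the two existing methods can be illustrated with a minimal sketch. It assumes that each ISP has a known amount of spare capacity in every 5-minute slot and that the storage node w has unlimited space and no bottleneck of its own. The function names and the numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def e2e_volume(sender_free: Sequence[float], receiver_free: Sequence[float]) -> float:
    """Volume deliverable end-to-end: in each slot the flow is limited by
    the smaller of the sender's and the receiver's spare capacity."""
    return sum(min(s, r) for s, r in zip(sender_free, receiver_free))


def snf_volume(sender_free: Sequence[float], receiver_free: Sequence[float]) -> float:
    """Volume deliverable with store-and-forward: the sender v uploads to the
    storage node w whenever it has spare capacity, and w pushes data to the
    receiver u whenever the receiver has spare capacity."""
    stored = 0.0      # data currently held at the storage node w (assumed unbounded)
    delivered = 0.0   # data that has reached the receiver u
    for s, r in zip(sender_free, receiver_free):
        stored += s                  # upload v -> w
        pushed = min(stored, r)      # push w -> u
        stored -= pushed
        delivered += pushed
    return delivered


# Hypothetical spare capacity per slot for two ISPs whose off-peak hours do not overlap.
sender_free = [10, 10, 0, 0]
receiver_free = [0, 0, 10, 10]
print(e2e_volume(sender_free, receiver_free))  # 0
print(snf_volume(sender_free, receiver_free))  # 20.0
```

When the off-peak slots of the two ISPs do not coincide, the E2E volume in this toy case drops to zero while the storage node lets SnF deliver everything the sender uploaded.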
3. Proposed method
3.1 Design
This system introduces a new routing prototype. The system is divided into three modules.
Fig. 1 Proposed system architecture (components: client with de-fragmentation, data centre (DC) with DC database, SnF server with SnF database; flows: request (Req), reply (Rep), data, and synchronization between DC and SnF)
3.2 Implementation Steps
• The client sends a request for data to the data centre.
• The data centre and the SnF are synchronised with each other. The SnF contains replicas of the data centre, and each is connected to a different database.
• The requested data is divided into multiple parts at the DC side. Each part has an ID.
• The data centre and the SnF simultaneously send these parts one by one to the client.
• At the client side, the fragmented parts are combined to recover the original file.
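The steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the network transfer is replaced by a stand-in function, the parts are assigned to the two servers in round-robin order, and all names (`split_into_parts`, `send_part`, `reassemble`, `download`) are hypothetical rather than part of the described system.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Part = Tuple[int, bytes]  # (part ID, payload)


def split_into_parts(data: bytes, num_parts: int) -> List[Part]:
    """Divide the requested data into num_parts pieces, each tagged with an ID."""
    size = max(1, -(-len(data) // num_parts))  # ceiling division, at least 1 byte
    return [(i, data[i * size:(i + 1) * size]) for i in range(num_parts)]


def send_part(server: str, part: Part) -> Part:
    """Stand-in for a network transfer from `server`; it simply returns the part."""
    return part


def reassemble(parts: Sequence[Part]) -> bytes:
    """Sort the received parts by ID and join them into the original file."""
    return b"".join(payload for _, payload in sorted(parts, key=lambda p: p[0]))


def download(data: bytes, num_parts: int = 3,
             servers: Sequence[str] = ("DC", "SnF")) -> bytes:
    """Fetch parts from the data centre and the SnF replica at the same time."""
    parts = split_into_parts(data, num_parts)
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(send_part, servers[i % len(servers)], part)
                   for i, part in enumerate(parts)]
        received = [f.result() for f in futures]
    return reassemble(received)


original = b"0123456789"
assert download(original) == original
```

Because every part carries its ID, the client can restore the original order even if the parts arrive out of sequence.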
This system tries to achieve the following objectives:
• Increasing speed for better efficiency: When a client sends a request, the request goes to the data center. The SNF server contains a replica of the data center. Both servers send fragmented data parts simultaneously, and the user side combines the fragments. Both the SNF server and the data centre transfer data using the E2E algorithm.
• De-fragment data: This module combines all fragmented data at the client side in minimum time and presents it as a single file. A single buffer holds the fragmented data; each part carries a distinct ID assigned by the sender, and the parts are sorted by ID at the receiver side.
• Use off-peak hours for minimizing cost: The 95-percentile pricing scheme works as follows. Let x denote a time series containing 5-minute transfer volumes between a customer ISP and a transit provider ISP over a charging period, typically a month. The customer pays the transit provider an amount given by the charging function c(x), which takes as input the charged volume q(x), defined as the 95-percentile value of x. Combining a diurnal variation pattern algorithm with the 95-percentile scheme allows data to be transferred during off-peak hours at reduced cost. The diurnal variation pattern is used to determine whether a given time is on-peak or off-peak for data transfer.
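The charged volume q(x) and the room it leaves for off-peak transfers can be sketched as follows. The nearest-rank percentile definition, the function names, and the sample traffic pattern are assumptions made for illustration; the paper does not specify them, and the charging function c(x) is left out because its form depends on the ISP contract.

```python
from typing import List, Sequence


def charged_volume(x: Sequence[float]) -> float:
    """q(x): the 95-percentile of the 5-minute transfer volumes in x
    (nearest-rank definition, assumed here)."""
    if not x:
        raise ValueError("x must contain at least one sample")
    ordered = sorted(x)
    index = -(-95 * len(ordered) // 100) - 1  # ceil(0.95 * n) - 1
    return ordered[index]


def free_off_peak_volume(x: Sequence[float]) -> List[float]:
    """Extra volume that fits in each slot without raising q(x):
    slots below the charged volume can be filled up to it."""
    q = charged_volume(x)
    return [max(q - v, 0) for v in x]


# Hypothetical diurnal pattern: 12 on-peak slots at 50 units, 8 off-peak slots at 10 units.
x = [50] * 12 + [10] * 8
q = charged_volume(x)
extra = free_off_peak_volume(x)
filled = [v + e for v, e in zip(x, extra)]

print(q)                        # 50
print(sum(extra))               # 320
print(charged_volume(filled))   # 50
```

In this toy pattern, 320 additional units can be sent in the off-peak slots while the charged volume, and therefore the bill, stays the same.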
3.3 Implementation Strategy
• Proposed algorithm for increasing speed using the Multipart method
Algorithm 1: Multipart(Fs, P) F
Variables: Fs = file name; P = peak on/off; F = file; St = start time; Et = end time; Tt = total time required; Dr = download price; Fc = file count.
Input: Fs, P
Output: F
If peak on-off condition is true then
    do
        Fc = filelength;
        Totallength += filelength;
        Return Fs;
    While filelength not equal to Fc
    Return St, Et, Tt, Dr
Else
    Display "File download from snf";
    Connect to snf;
End
• Proposed algorithm to connect the client to the server
Algorithm 2: connectsnf(Fs, P) F
Variables: Fs = file name; P = peak on/off; F = file; St = start time; Et = end time; Tt = total time required; Dr = download price; Fc = file count.
Input: Fs, P
Output: F
If peak on-off condition is true then
    do
        Fc = filelength;
        Totallength += filelength;
        Return Fs;
    While filelength not equal to Fc
    Return St, Et, Tt, Dr;
Else
    Wait for peak off
End
Example:
File size: 5.13 MB (audio file)
File size: 2 MB
No. of parts: 3
Total time = end time - start time = 271 ms
Cost = total data transferred (MB) * price
File size in bytes = 1024 * 1024 * 2
Cost = (6391456 / (1024 * 1024)) * peak-on cost per MB
Cost = (6391456 / (1024 * 1024)) * 20
Cost ≈ 120
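The arithmetic of the example can be checked with a short script. The function names are hypothetical; the byte count 6391456, the price of 20 per MB, and the 271 ms duration come from the example, while the start and end timestamps are made up to reproduce that duration.

```python
BYTES_PER_MB = 1024 * 1024


def transfer_cost(num_bytes: int, price_per_mb: float) -> float:
    """Cost = total data transferred (in MB) * price per MB."""
    return num_bytes / BYTES_PER_MB * price_per_mb


def total_time_ms(start_ms: float, end_ms: float) -> float:
    """Total time is the end time minus the start time."""
    return end_ms - start_ms


cost = transfer_cost(6391456, 20)
print(round(cost, 2))            # 121.91
print(total_time_ms(100, 371))   # 271
```

The exact product is about 121.9, which is consistent with the rounded figure of 120 given in the example.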
4. Comparative result analysis of system:
4.1 Existing System:
In the existing system, several important scientific and industrial applications require exchanging delay-tolerant bulk (DTB) data. Depending on the application, DTB data is currently serviced by expensive dedicated networks, like the LHC Computing Grid, using the E2E method.
Drawback
• In the existing system, the load on a single server makes the transmission speed lower than required. The data has delay tolerances that range from several hours to a few days.
• The Scavenger service of QBone is limited in that it protects the QoS of interactive traffic but cannot protect against high transit costs or meet specific deadlines.
4.2 Implemented System:
In the implemented system, the same scientific and industrial applications exchange delay-tolerant bulk (DTB) data, but the DTB data is serviced using the multipart (MP) method. System analysis gives the following results.

Table 1: Result of sending audio files to client

Audio file                              E2E method    MP method
Data size (MB)                          6.19          6.19
Time required to transfer data (ms)     909           359
Cost (Rs)                               180           160

Audio file                              E2E method    MP method
Data size (MB)                          3.38          3.38
Time required to transfer data (ms)     269           203
Cost (Rs)                               94            80

Table 2: Result of sending ZIP files to client

ZIP file                                E2E method    MP method
Data size (MB)                          3.5           3.5
Time required to transfer data (ms)     269           188
Cost (Rs)                               110           80

ZIP file                                E2E method    MP method
Data size (MB)                          3.46          3.46
Time required to transfer data (ms)     245           203
Cost (Rs)                               102           80
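The relative gains implied by the measurements above can be derived with a short script. The time and cost figures are copied from the tables; the two helper functions and the derived ratios are illustrative additions and are not reported in the paper.

```python
# (file type, data size in MB, E2E time ms, MP time ms, E2E cost Rs, MP cost Rs)
results = [
    ("audio", 6.19, 909, 359, 180, 160),
    ("audio", 3.38, 269, 203, 94, 80),
    ("zip", 3.5, 269, 188, 110, 80),
    ("zip", 3.46, 245, 203, 102, 80),
]


def speedup(e2e_ms: float, mp_ms: float) -> float:
    """How many times faster the MP method is than E2E."""
    return e2e_ms / mp_ms


def cost_saving_percent(e2e_cost: float, mp_cost: float) -> float:
    """Cost reduction of the MP method relative to E2E, in percent."""
    return (e2e_cost - mp_cost) / e2e_cost * 100


for kind, size_mb, t_e2e, t_mp, c_e2e, c_mp in results:
    print(f"{kind} {size_mb} MB: "
          f"{speedup(t_e2e, t_mp):.2f}x faster, "
          f"{cost_saving_percent(c_e2e, c_mp):.1f}% cheaper")
```

In all four measurements the MP method is both faster and cheaper than E2E, with the largest speedup on the 6.19 MB audio file.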
Graph 1: Result of time required to transfer data (series: time(E2E), time(MP); data sizes: 6.19, 3.5, 3.46, 3.38)
Graph 2: Result of cost required to transfer data (series: cost(Rs) E2E, cost(Rs) MP; data sizes: 6.19, 3.5, 3.46, 3.38)
5. Conclusion
We built a system that increases the efficiency of sharing bulk data over the network without increasing the load on servers. The system manages the pricing issue according to the peak on/off periods of the network. It also handles delay-tolerant data transfer between the server and multiple clients using the SnF (store-and-forward) method, and manages price-based data transfer according to the package provided by the network provider (ISP).
6. References
[1] Nikolaos Laoutaris, Georgios Smaragdakis, Rade Stanojevic, Pablo Rodriguez, and Ravi Sundaram. Delay-Tolerant Bulk Data Transfers on the Internet. IEEE Transactions on Networking, 2013.
[2] Cong Shi, Mostafa H. Ammar, and Ellen W. Zegura. iDTT: Delay Tolerant Data Transfer for P2P File Sharing Systems. College of Computing, Georgia Institute of Technology. IEEE Transactions on Networking, 2011.
[3] Eli Brosh, Salman Abdul Baset, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne. The Delay-Friendliness of TCP for Real-Time Traffic. 2010.
[4] Salman Abdul Baset, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne. 2010.
[5] R. J. D'Souza and Johny Jose. Routing Approaches in Delay Tolerant Networks: A Survey. International Journal of Computer Applications (0975 - 8887), Volume 1, No. 17, 2010.
[6] Niraj Tolia, Michael Kaminsky, David G. Andersen, and Swapnil Patil. An Architecture for Internet Data Transfer. Carnegie Mellon University, Intel Research Pittsburgh, 2011.
[7] Parminder Chhabra, Vijay Erramilli, Nikos Laoutaris, Ravi Sundaram, and Pablo Rodriguez. Algorithms for Constrained Bulk-transfer of Delay-Tolerant Data. 2009.
[8] Vinay Aggarwal, Anja Feldmann, and Christian Scheideler. Can ISPs and P2P users cooperate for improved performance? ACM SIGCOMM Computer Communication Review, 37(3):29–40, 2007.
[9] Vytautas Valancius, Cristian Lumezanu, Nick Feamster, Ramesh Johari, and Vijay V. Vazirani. How Many Tiers? Pricing in the Internet Transit Market. Georgia Tech, Stanford University.
[10] Nikolaos Laoutaris, Michael Sirivianos, Xiaoyuan Yang, and Pablo Rodriguez. Inter-Datacenter Bulk Transfers with NetStitcher. Telefonica Research, Barcelona, Spain.
[11] Rade Stanojevic, Nikolaos Laoutaris, and Pablo Rodriguez. On Economic Heavy Hitters: Shapley value analysis of 95th-percentile pricing. Telefonica Research.
[12] Michael Demmer, Eric Brewer, Kevin Fall, Sushant Jain, Melissa Hon, and Rabin Patra. Implementing Delay Tolerant Networking.
[13] Eli Brosh, Salman Abdul Baset, Dan Rubenstein, and Henning Schulzrinne. The Delay-Friendliness of TCP. In Proceedings of ACM SIGMETRICS ’08.
[14] Joe Chabarek, Joel Sommers, Paul Barford, Cristian Estan, David Tsiang, and Steve Wright. Power Awareness in Network Design and Routing. In Proceedings of IEEE INFOCOM ’08.
[15] Costas Courcoubetis and Richard Weber. Pricing Communication Networks: Economics, Technology & Modelling. Wiley, 2003.
[16] Amogh Dhamdhere and Constantinos Dovrolis. ISP and Egress Path Selection for Multihomed Networks. In Proceedings of IEEE INFOCOM ’06.
[17] Xbox Live and Netflix. At http://www.xbox.com/en-US/live/netflix/default.html